Abstract - Variable-rate data transmission schemes in which
constellation points are selected according to a nonuniform
probability distribution are studied. When the criterion is one of
minimizing the average transmitted energy for a given average bit
rate, the best possible distribution with which to select
constellation points is a Maxwell-Boltzmann distribution. In
principle, when constellation points are selected according to a
Maxwell-Boltzmann distribution, the ultimate shaping gain
($\pi e/6$ or 1.53 dB) can be achieved in any dimension. Nonuniform
signaling schemes can be designed by mapping simple variable-length
prefix codes onto the constellation. Using the Huffman procedure,
prefix codes can be designed that approach the optimal performance.
These schemes provide a fixed-rate primary channel and a
variable-rate secondary channel, and are easily incorporated into
standard lattice-type coded modulation schemes.

Index Terms - Signal constellations, maximum entropy principle,
shaping gain, coded modulation, Huffman codes.



I. Introduction

IN THE CONVENTIONAL approach to data transmission, 
each point in a given constellation is equally likely to be 
transmitted. While this approach yields the maximum bit rate 
for a given constellation size, it does not take into account the 
energy cost of the various constellation points. In this paper, 
the idea of choosing constellation points with a nonuniform 
probability distribution is explored. Such nonuniform signaling 
will reduce the entropy of the transmitter output, and hence 
the average bit rate. However, if points with small energy 
are chosen more often than points with large energy, energy 
savings may (more than) compensate for this loss in bit rate. 

It follows immediately from the maximum entropy principle (see,
e.g., [1, ch. 11]), or by variational calculus as in [2, Section
IV-B], that the probability distribution that maximizes entropy for
a fixed average energy is one in which a constellation point $r$,
with energy $\|r\|^2$, is chosen with probability
$p(r) \propto \exp(-\lambda \|r\|^2)$, where the nonnegative
parameter $\lambda$ governs the trade-off between bit rate and
average energy. Nonuniform signaling with this family of
distributions, well known in statistical mechanics and
thermodynamics as Maxwell-Boltzmann distributions [3], [4] or as
Gibbs ensembles [5], is the focus of this paper.

Nonuniform signaling is closely related to the notion of
constellation shaping in coded modulation (as described in, e.g.,
[2], [6]-[9]). Constellation shaping can provide an energy savings
called shaping gain in addition to the usual coding gain provided by
lattice- or trellis-coding. Indeed, the gain $G$ provided by a coded
modulation system (relative to a simple pulse amplitude modulation
baseline) operating at a bit rate $\beta$ (bits per two-dimensional
channel-use) is well-approximated by writing

$$G \approx \gamma_c \gamma_s (1 - 2^{-\beta}),$$

where $\gamma_c$ and $\gamma_s$ denote, respectively, the coding gain
[10] and shaping gain [2] of the scheme in question. As will be
shown, the discretization factor $1 - 2^{-\beta}$ properly adjusts
the gain for finite bit rates. The connection between shaping and
nonuniform signaling arises from the fact that, in schemes that
employ shaping, a nonuniform distribution is induced on the points of
the low-dimensional constituent constellation. By applying nonuniform
signaling directly (rather than indirectly via constellation
shaping), one can, in any dimension, achieve the ultimate shaping
gain ($\pi e/6$ or 1.53 dB), which is attainable with uniform
signaling only in the limit of infinite dimension [2]. Indeed, one of
the principal results of this paper is a method of designing simple
nonuniform signaling schemes that approach this ultimate level.

It is important to note at the outset, however, that prac- 
tical implementation of direct nonuniform signaling will be 
hampered by the variable transmission rate of such schemes. 
Transmitting data obtained from a fixed-rate source requires 
data buffering at the transmitter and the receiver, which leads 
to the problem of coping with buffer over- or underflow. 
Furthermore, since the transmitted signals represent variable 
numbers of bits, channel errors may cause the insertion and 
deletion of bits in the decoded data, causing potential losses 
of synchronization. Although these system problems will tend 
to limit the broad applicability of nonuniform signaling, we 
do not attempt to provide solutions to these problems in this 
paper. 

Instead, our aim in this paper is 1) to provide insight into
nonuniform signaling schemes and how they relate to conventional
signaling schemes, 2) to assess the potential gains that nonuniform
signaling may provide, and 3) to provide a method (via the Huffman
algorithm) by which simple, near-optimal, nonuniform signaling
schemes may be designed. Such nonuniform signaling schemes provide a
standard, fixed-rate, primary channel unaffected by the system
problems mentioned in the previous paragraph, together with a
variable-rate secondary channel, in which these system problems may
be acceptable. At various places in this paper, the close analogy
between these nonuniform signaling schemes and mathematically
equivalent statistical mechanical systems will be pointed out.

To motivate our interpretation of nonuniform signaling as 
a shaping method, we summarize the complementary notions 
of coding and shaping in Section II. In Section III, we briefly
define the coded modulation parameters that will be needed 
throughout the paper. In Section IV, we discuss various proper- 
ties of the Maxwell-Boltzmann distribution that are important 
in this setting. The results of applying this distribution to 
signal point selection from spherical constellations based on 
various dense lattices are given in Section V. In Section VI, 
we define and apply "continuous approximations" to show that 
the ultimate shaping gain of $\pi e/6$, or 1.53 dB, is attainable
in any dimension when constellation points are selected with 
a Maxwell-Boltzmann distribution. In Section VII, we show 
how the Huffman procedure may be used to obtain dyadic 
approximations to the Maxwell-Boltzmann distribution that 
provide near-ultimate shaping gains. In Section VIII, we dis- 
cuss the integration of nonuniform signaling with coset coding. 
Finally, we make some general comments and concluding 
remarks in Section IX. 



II. Coding and Shaping 

Coding and shaping are two separate and complementary 
operations that contribute to the gain of lattice-type coded 
modulation schemes such as lattice codes and lattice-type 
trellis codes. In implementation as well as in analysis, the two 
operations are dual and separable and provide two additive 
gain components: coding gain and shaping gain. We say that 
coding gain is a distance property of the coded modulation
scheme because it depends, in general, on the set of distances 
between the various transmitted signal sequences. Coding is 
generally performed to achieve a large minimum distance 
between signal-sequences, i.e., coding attempts to be distance- 
maximizing. Shaping gain, on the other hand, is an energy 
property of the coded modulation scheme because it depends, 
in general, on the energy of the various transmitted signal 
sequences. Shaping is generally performed to achieve a small 
average transmitted energy while maintaining the desired bit 
rate, i.e., shaping attempts to be energy-minimizing. 

In more general terms, coding and shaping attempt to solve 
two related, but different, problems. The coding problem is to 
find a large set of symbol sequences that can be distinguished 
with high reliability in the presence of noise. The shaping 
problem is to use these symbol sequences to deliver maximum 
information to the receiver at minimum cost where, in this 
paper, cost is measured by the average energy per transmitted 
symbol.

Most lattice-type coded modulation schemes based on a 
lattice partition $\Lambda/\Lambda'$ have the encoder structure shown in
Fig. 1 (see [10]-[12]). The encoder consists of a coset code
C and a signal point selector S. The coset code C produces 



[Fig. 1. Encoder structure for lattice-type coded modulation schemes.
Recoverable labels: input data; sequence of cosets of $\Lambda'$ in
$\Lambda$; to channel.]

a sequence of sets of signals, drawn from the cosets of a 
sublattice $\Lambda'$ in a lattice $\Lambda$. In general, each coset contains
an infinite number of points. The role of the signal point 
selector S is to choose an actual constellation point to be 
transmitted from each infinite coset. The coset code C is 
usually chosen to maximize the minimum distance between 
different possible signal sequences. The signal point selector 
S usually attempts to minimize the average transmitted energy 
while supporting the desired bit rate. In light of our previous 
discussion, it should be clear that C performs the coding (or 
distance-maximizing) operation, while S performs the shaping
(or energy-minimizing) operation. Both the coset code C and 
the signal point selector S contribute to the transmission of 
data in these coded modulation schemes. In this paper, the 
signal point selection or shaping component of these coded 
modulation schemes is studied. 

Shaping schemes (or signal point selectors) may be classified as
being either fixed-rate or variable-rate. Fixed-rate schemes achieve
the transmission of a fixed number of bits over some well-defined
signaling interval. Generalized cross constellations [2], Voronoi
constellations [6], block shaping codes [7], trellis shaping codes
[8], and truncated polydisc constellations [9] are all examples of
fixed-rate shaping schemes. From the point of view of maximizing
shaping gain, the best possible $N$-dimensional constellation shape
is the $N$-sphere, since it achieves a specified volume with least
average energy.

With variable-rate schemes, the number of bits transmitted 
during a signaling interval is a random variable. A simple 
example of such a scheme is given in [13], where a binary 
data stream is parsed into the codewords of a variable-length 
prefix code, which are then mapped onto the points of a 
constellation. (A more detailed account of this type of scheme 
is given in [14].) Other examples of variable-rate schemes 
that may be interpreted as mapping the words of a prefix code 
onto a constellation include the shaping schemes described 
by Livingston [15], the block-encoded modulation schemes 
of Chouly and Sari [16], and the signaling schemes with 
"opportunistic secondary channels" described by Forney and 
Wei [2] (see also [17]-[21]). It should be noted, however, that 
these latter schemes were not constructed with shaping gain 
in mind. 
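
As a rough illustration of the prefix-code idea described above, the following Python sketch parses a binary data stream into codewords and maps each codeword onto a constellation point. The codebook, the point labels, and the function name are hypothetical and are not taken from [13] or [14]; the sketch only assumes that shorter codewords are assigned to points of smaller energy.

```python
def parse_prefix_code(bits, codebook):
    """Parse a bit string into constellation points using a prefix code.

    codebook maps codeword strings to constellation points (hypothetical labels).
    Returns the selected points and any trailing bits that do not yet form
    a complete codeword.
    """
    points, current = [], ""
    for b in bits:
        current += b
        if current in codebook:        # prefix property: first match is the codeword
            points.append(codebook[current])
            current = ""
    return points, current


# Hypothetical 1-D example: the two inner points get 2-bit codewords,
# the four outer points get 3-bit codewords (Kraft sum = 1).
CODEBOOK = {
    "00": -0.5, "01": 0.5,
    "100": -1.5, "101": 1.5, "110": -2.5, "111": 2.5,
}

if __name__ == "__main__":
    print(parse_prefix_code("0010111001", CODEBOOK))
```

With equiprobable input bits, each point in this sketch is selected with a probability equal to 2 raised to minus its codeword length, so the number of bits carried per symbol is a random variable, as the text notes.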



III. Definitions

In this section, definitions for various parameters used
throughout this paper are provided. Most of these parameters
are carefully defined in [2] for the case of uniform signaling;
here these definitions are extended to the case of nonuniform 
signaling. 

Throughout this paper, we deal with constellations $\Omega$ embedded
in an $N$-dimensional ($N$-D) vector space with a well-defined
(Euclidean) distance and norm. The size of $\Omega$, $|\Omega|$, is
usually finite, but not necessarily so (as in the case of an infinite
lattice). Often, $\Omega$ will be obtained as the intersection of a
lattice $\Lambda$ (or a translate $a + \Lambda$) with a finite region
$R$, in which case we denote the constellation by
$\Omega(\Lambda, R)$. The energy (squared norm) of a point
$r \in \Omega$ is denoted by $\|r\|^2$. We shall usually assume that
the transmitter produces a sequence of symbols drawn independently
from the constellation and that the symbols are selected at some
regular symbol rate. The probability with which the transmitter
selects a point $r \in \Omega$ is denoted by $p(r)$. As usual in the
study of data transmission schemes, we are concerned with trade-offs
among three parameters: reliability, bit rate, and transmitter power.

A. Reliability, Bit Rate, Transmitter Power 

Although the most natural reliability measure for symbol
transmission in a noisy channel is, perhaps, $P_e$ (the average
symbol error rate), this measure is often difficult to compute and to
work with, especially in the case of complicated multidimensional
constellations. A simple (and well-established) reliability measure
for a signaling scheme on the Gaussian channel is the parameter
$d_{\min}^2$, the minimum squared Euclidean distance between
different constellation points. Formally,

$$d_{\min}^2 \triangleq \min\{ d^2(r, r') : r, r' \in \Omega,\ r \neq r' \},$$

where $d^2(r, r')$ denotes the squared Euclidean distance between
constellation points $r$ and $r'$. Schemes with greater $d_{\min}^2$
will tend to have smaller symbol error rate, at least for large SNR
(signal-to-noise ratio), and hence greater reliability; thus, we will
take $d_{\min}^2$ as the principal reliability measure in this paper.
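
The definition of the minimum squared distance can be checked by brute force on a small constellation. The Python sketch below is only an illustration; the 16-point square constellation and the function name are assumptions introduced here, not objects defined in the paper.

```python
from itertools import combinations


def min_squared_distance(points):
    """Minimum squared Euclidean distance over all pairs of distinct points."""
    return min(
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in combinations(points, 2)
    )


# Hypothetical 16-point square constellation with half-integer coordinates.
QAM16 = [(x + 0.5, y + 0.5) for x in range(-2, 2) for y in range(-2, 2)]

if __name__ == "__main__":
    print(min_squared_distance(QAM16))
```

The pairwise search costs on the order of the square of the constellation size, which is acceptable for small examples; for lattice-based constellations the minimum distance is normally read off from the lattice itself.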

In fact, $d_{\min}^2$ can be used to estimate the symbol error rate
at moderate to high SNRs. Let $N_{\min}(r)$ denote the number of
constellation points at distance $d_{\min}$ from the point $r$. The
error coefficient $\bar{N}$ is the average of $N_{\min}$ over the
constellation; that is,

$$\bar{N} \triangleq \sum_{r \in \Omega} p(r) N_{\min}(r). \tag{1}$$

Assuming a white Gaussian noise channel with a one-sided noise power
spectral density of $N_0$ W/Hz, the dominant term in a simple union
bound on $P_e$, assuming maximum-likelihood decoding, gives us the
estimate

$$P_e \approx \bar{N}\, Q\!\left(\sqrt{d_{\min}^2/(2N_0)}\right), \tag{2}$$

where $Q(x) \triangleq (1/\sqrt{2\pi}) \int_x^\infty \exp(-u^2/2)\, du$.
It is important to note that $\bar{N}$, unlike $d_{\min}^2$, is
affected by the probability with which constellation points are
selected.

In information-theoretic terms, the transmitter is a discrete
memoryless source whose output alphabet is the set of points in the
constellation. The (average) bit rate is equal to the entropy of the
transmitter in bits per transmitted symbol. In formal terms, a scheme
with a constellation $\Omega$ in which the symbol $r$ is selected
independently with probability $p(r)$ has bit rate

$$\beta \triangleq -\frac{2}{N} \sum_{r \in \Omega} p(r) \log_2 p(r) \quad \text{bits/2-D channel-use}, \tag{3}$$

where, as in [2], we have normalized to 2-D. For the special case of
a uniform probability distribution over a finite $N$-D constellation
$\Omega$, the bit rate is $(2/N) \log_2 |\Omega|$ bits/2-D
channel-use, and this is the maximum bit rate for the given
constellation. A signaling scheme with a normalized bit rate of
$\beta$ can send roughly $\beta$ bits/s/Hz when implemented with a
QAM (quadrature amplitude modulation) modem (which sends sequences of
2-D signals).$^1$

$^1$ Actually, a spectral throughput of $\beta$ bits per second per
Hertz is achieved with QAM only by using ideal Nyquist pulse shaping.
In practice, some excess bandwidth is needed, so $\beta$ represents
an upper bound on spectral throughput.

Transmitter power is proportional to the average energy per
transmitted symbol. The normalized average energy per symbol per two
dimensions is given by

$$E \triangleq \frac{2}{N} \sum_{r \in \Omega} p(r) \|r\|^2. \tag{4}$$

B. Constellation Figure of Merit and Gain

In any data transmission scheme, we would like to transmit at a large
bit rate, with as high a reliability and as low a transmitter power
as possible. A commonly used figure of merit for a signaling scheme,
sometimes termed the "constellation figure of merit" or CFM [2], is
the dimensionless, scale-invariant ratio
$\mathrm{CFM} \triangleq d_{\min}^2/E$.

To compare the relative energy efficiency of two schemes, we use the
estimate (2). Let

$$P_{e,1}(E/N_0) \approx \bar{N}_1\, Q\!\left(\sqrt{\mathrm{CFM}_1 E/(2N_0)}\right)$$

and

$$P_{e,2}(E/N_0) \approx \bar{N}_2\, Q\!\left(\sqrt{\mathrm{CFM}_2 E/(2N_0)}\right)$$

denote symbol error rate estimates for two schemes operating at the
same bit rate and having, respectively, constellation figures of
merit $\mathrm{CFM}_1$ and $\mathrm{CFM}_2$ and error coefficients
$\bar{N}_1$ and $\bar{N}_2$. Then, $P_{e,1}^{-1}(p)$ and
$P_{e,2}^{-1}(p)$ denote, respectively, the approximate $E/N_0$ value
needed by each of the two schemes to achieve the symbol error rate
$p$. At a fixed symbol error rate $p$, the gain, $G(p)$, of the first
scheme relative to the second is given by the ratio of $E/N_0$
values, i.e.,

$$G(p) \triangleq P_{e,2}^{-1}(p)/P_{e,1}^{-1}(p) = (\mathrm{CFM}_1/\mathrm{CFM}_2) \times \gamma_n(p, \bar{N}_1, \bar{N}_2), \tag{5}$$

where

$$\gamma_n(p, \bar{N}_1, \bar{N}_2) = \left[ Q^{-1}(p/\bar{N}_2) / Q^{-1}(p/\bar{N}_1) \right]^2.$$

(This latter factor is easily evaluated via a convenient
approximation for $Q^{-1}(x)$ due to Hastings [22] given in
Abramowitz and Stegun [23, section 26.2.23].) For small values of
$x$,

$$Q^{-1}(x) \approx \sqrt{\ln(1/x^2)}, \tag{6}$$

and hence the asymptotic gain

$$G = \lim_{p \to 0} G(p) = \mathrm{CFM}_1/\mathrm{CFM}_2$$

is the ratio of constellation figures of merit, and is not affected
by error coefficient values. At nonzero values of $p$, the asymptotic
gain will be affected by the $\gamma_n$ factor in (5). In particular,
when $\bar{N}_1 > \bar{N}_2$, the asymptotic gain will be reduced.
When $p$ is small, we can again use (6) and write

$$\gamma_n(p, \bar{N}_1, \bar{N}_2) \approx \left[ 1 + \log(\bar{N}_1/\bar{N}_2)/\log(\bar{N}_2/p) \right]^{-1} \tag{7}$$

to estimate this effect.

For bit rates $\beta \ge 2$ bits/2-D channel-use, we take as a
baseline for comparison the CFM obtainable with a simple 1-D PAM
(pulse amplitude modulation) constellation. Assuming that
constellation points are selected with equal probability, this
baseline has figure of merit given by
$\mathrm{CFM}_\oplus = 6/(2^\beta - 1)$. The (asymptotic) gain of a
scheme with a given CFM, relative to the baseline, is then given by

$$G(d_{\min}^2, \beta, E) \triangleq \mathrm{CFM}/\mathrm{CFM}_\oplus = \frac{d_{\min}^2 (2^\beta - 1)}{6E}, \quad \beta \ge 2. \tag{8}$$

This gain measure effectively combines our reliability, bit rate, and
transmitter power measures into a single figure, thus allowing for
direct comparisons of disparate schemes. Note also that the gain
formula (8) is valid only for $\beta \ge 2$, i.e., for bit rates of
at least one bit per symbol per dimension. Since we wish to study
bandwidth-efficient schemes, i.e., those with large $\beta$, this
restriction will pose no problem, and we assume it to hold throughout
this paper.
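
The baseline figure of merit $6/(2^\beta - 1)$ can be checked numerically by building the uniform PAM constellation directly. The Python sketch below is an illustration under stated assumptions: unit spacing, $2^{\beta/2}$ equiprobable points per dimension centered at the origin, and function names chosen here for convenience.

```python
def pam_cfm(beta):
    """CFM of a uniform 1-D PAM constellation at beta bits per two dimensions.

    Assumes M = 2**(beta/2) equiprobable points with unit spacing (d_min^2 = 1).
    """
    m = round(2 ** (beta / 2))
    points = [i - (m - 1) / 2 for i in range(m)]
    energy_per_dim = sum(x * x for x in points) / m
    energy_2d = 2 * energy_per_dim      # normalize the energy to two dimensions
    return 1.0 / energy_2d


def gain_over_baseline(d2min, beta, energy_2d):
    """Asymptotic gain relative to the PAM baseline, valid for beta >= 2."""
    return d2min * (2 ** beta - 1) / (6 * energy_2d)


if __name__ == "__main__":
    for beta in (2, 4, 6):
        print(beta, pam_cfm(beta), 6 / (2 ** beta - 1))
```

By construction the baseline itself has gain 1 (0 dB): for example, at $\beta = 4$ the PAM energy per two dimensions is 2.5, and `gain_over_baseline(1.0, 4, 2.5)` evaluates to 1.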

While we will be interested primarily in asymptotic gain in this
paper, we can use (5) to estimate the effect of the error coefficient
at nonzero error probability values. In Appendix A, we show that a
cubic $N$-D constellation based on $\mathbb{Z}^N$ under uniform
signaling and supporting a bit rate $\beta$ (i.e., an $N$-D baseline
constellation) has error coefficient
$\bar{N}_\oplus = 2N(1 - 2^{-\beta/2})$. Thus, in two dimensions and
for large values of $\beta$, $\bar{N}_\oplus \approx 4$. Using (7),
we estimate that every factor of two increase in the error
coefficient (over the baseline of 4) reduces the asymptotic gain by
about 0.2 dB for $p$ on the order of $10^{-6}$, in agreement with the
rule of thumb given by Forney [10, p. 1142]. (A plot of $\gamma_n$ in
dB versus $\log_2 \bar{N}$ suggests that a loss of 0.22 dB per error
coefficient doubling is a fairly accurate rule, up to about
$\bar{N} = 64$, at $p = 10^{-6}$. Similarly,
at $p = 10^{-8}$, the loss is about 0.17 dB per doubling in error
coefficient.) 
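
The size of the error-coefficient penalty can be estimated numerically. The Python sketch below evaluates the factor $[Q^{-1}(p/\bar{N}_2)/Q^{-1}(p/\bar{N}_1)]^2$ from (5) using the standard library's inverse normal CDF, and compares it with the small-$p$ approximation $[1 + \log(\bar{N}_1/\bar{N}_2)/\log(\bar{N}_2/p)]^{-1}$ of (7). The function names are assumptions made for this illustration, and the use of `statistics.NormalDist` in place of the Hastings approximation cited in the text is a substitution made here.

```python
import math
from statistics import NormalDist


def q_inv(x):
    """Inverse of the Gaussian tail function Q, for 0 < x < 1."""
    return -NormalDist().inv_cdf(x)


def gamma_n_exact(p, n1, n2):
    """Error-coefficient factor from (5)."""
    return (q_inv(p / n2) / q_inv(p / n1)) ** 2


def gamma_n_approx(p, n1, n2):
    """Small-p approximation (7)."""
    return 1.0 / (1.0 + math.log(n1 / n2) / math.log(n2 / p))


def to_db(x):
    return 10 * math.log10(x)


if __name__ == "__main__":
    # Loss in dB when the error coefficient doubles from the 2-D baseline of 4.
    print(-to_db(gamma_n_exact(1e-6, 8, 4)), -to_db(gamma_n_approx(1e-6, 8, 4)))
```

At an error rate of $10^{-6}$ both values come out at roughly 0.2 dB per doubling, in line with the rule of thumb quoted above.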

It is important to note that when the inner points of a constellation
are selected more often than the outer points, the error coefficient
will increase, because the inner points tend to have greater
$N_{\min}$ than the outer points. For example, consider a
constellation drawn from the $\mathbb{Z}^2$ lattice, for which
$\bar{N} \le 4$. In light of the previous paragraph, we can estimate
the maximum loss in gain by

$$0.22 \log_2 (4/\bar{N}_\oplus)\ \text{dB} = -0.22 \log_2 (1 - 2^{-\beta/2})\ \text{dB}$$

for error rates on the order of $10^{-6}$. For $\beta = 2$, when the
baseline has $\bar{N}_\oplus = 2$, this maximum possible degradation
is quite large (0.22 dB) relative to the maximum achievable shaping
gain (1.53 dB); however, for $\beta \approx 6$, the maximum
degradation is only about 0.04 dB. At smaller error rates, the
degradation is even less. As mentioned, we prefer to focus on the
asymptotic gain, which is unaffected by variations in the error
coefficient, but we caution the reader to note that the error
coefficient must be accounted for in estimating the gain at nonzero
error probabilities.

Suppose now that the constellation $\Omega$ is obtained from an
infinite $N$-D lattice $\Lambda$, and that a point $r \in \Omega$ is
selected with probability $p(r)$. If $\Omega \neq \Lambda$, it is
convenient to extend the distribution $p(r)$ to all the points of
$\Lambda$, simply by assigning zero probability to any points not in
$\Omega$. Let $V(\Lambda)$ be the volume of a fundamental region [24]
of $\Lambda$, i.e., the volume of $N$-space associated with each
lattice point, and let $2^{N\beta/2} V(\Lambda)$ be the "entropy
volume" of $\Lambda$ with distribution $p(r)$. We can then write
$G(d_{\min}^2, \beta, E)$ (8) as

$$G(d_{\min}^2, \beta, E) = \gamma_c(\Lambda)\, \gamma_s(p)\, (1 - 2^{-\beta}), \tag{9}$$

where

$$\gamma_c(\Lambda) \triangleq d_{\min}^2(\Lambda)/V(\Lambda)^{2/N} \tag{10}$$

is the coding gain [10] of the lattice $\Lambda$, and

$$\gamma_s(p) \triangleq 2^{\beta} V(\Lambda)^{2/N} / [6E(p)]$$

is the shaping gain of $\Lambda$ with distribution $p(r)$. For large
bit rates $\beta$, when the discretization factor
$\gamma_d(\beta) = 1 - 2^{-\beta}$ is close to unity, the total gain
is approximately separable into the product of a coding gain and a
shaping gain. The coding gain $\gamma_c(\Lambda)$, a geometric
property of the lattice $\Lambda$ studied in the coded modulation
literature and elsewhere [10], [11], [24], is independent of the
probability distribution $p(r)$ used to select constellation points
and therefore not of central interest in this paper. The shaping gain
$\gamma_s(p)$ is largely independent of the underlying lattice
$\Lambda$, except insofar as the lattice restricts the distribution
$p(r)$; this is why we have suppressed an explicit dependence on
$\Lambda$ in our notation. In general [2],
$\gamma_s(p) \le \pi e/6$. As we shall see, by choosing the
distribution $p(r)$ to be the Maxwell-Boltzmann distribution,
$\gamma_s(p)$ can be made to approach the ultimate shaping gain of
$\pi e/6$ in any dimension.

C. Other Constellation Parameters 

Often, higher-dimensional constellations are obtained as subsets
drawn from Cartesian products of lower-dimensional "constituent"
constellations. When the dimension $N$ of $\Omega$ is even, we may
define the constituent 2-D constellation of $\Omega$ as the smallest
2-D constellation $\Omega_2$ such that
$\Omega \subseteq \Omega_2^{N/2}$, where $\Omega_2^{N/2}$ is the
$N/2$-fold Cartesian product of $\Omega_2$ with itself. The
constituent 2-D constellation $\Omega_2$ plays an important role when
the signaling scheme is to be implemented with a QAM modem, since all
transmitted signals are obtained as sequences of QAM signals drawn
from $\Omega_2$.

If $\Omega$ is odd-dimensional, then $\Omega^2 = \Omega \times \Omega$
is even-dimensional and can be implemented with a QAM modem. This
suggests that we may define the constituent 2-D constellation of
$\Omega$ as the constituent 2-D constellation of $\Omega^2$. In
particular, if $N = 1$, this implies $\Omega_2 = \Omega^2$. For later
use, we note that the constituent 2-D constellation of $B_N(R)$, an
$N$-ball of radius $R$ centered at the origin, is a 2-D disk
$B_2(R)$ when $N$ is even, and is a square $B_1^2(R)$ of side $2R$
when $N$ is odd.
As discussed in [2] and [25], an important parameter in the design of
a signaling scheme is the 2-D "constellation expansion ratio"

$$\mathrm{CER}_2(\Omega) \triangleq |\Omega_2|/2^{\beta},$$

where $|\Omega_2|$ is the number of points in the constituent 2-D
constellation of $\Omega$, and $2^{\beta}$ represents the number of
points in a comparable baseline constellation supporting the same bit
rate with uniform signaling. Since
$|\Omega_2| \ge |\Omega|^{2/N} \ge 2^{\beta}$, we have
$\mathrm{CER}_2(\Omega) \ge 1$. Large 2-D constellations are
sensitive to nonlinearities and other signal-dependent perturbations,
so it is desirable that $\mathrm{CER}_2(\Omega)$ be as close to its
lower bound of unity as possible.

Another important parameter discussed in [2] is the 2-D
"peak-to-average energy ratio"

$$\mathrm{PAR}_2(\Omega) \triangleq r_{\max}^2(\Omega_2)/E,$$

where $r_{\max}^2(\Omega_2)$ is the maximum energy of the points in
$\Omega_2$, the constituent 2-D constellation of $\Omega$, and $E$ is
the normalized average energy. $\mathrm{PAR}_2$ is a measure of the
dynamic range of the signals transmitted by a QAM modem. To minimize
the effects of signal-dependent distortion, it is desirable that
$\mathrm{PAR}_2$, like $\mathrm{CER}_2$, be as small as possible.
(Note that the "peak" energy in this definition is found by averaging
a signal over a 2-D interval, thus making $\mathrm{PAR}_2$
independent of the pulse shape used in implementation. The actual
"instantaneous" peak energy depends on the actual pulses used, and on
how these pulses superpose in time when transmitted in sequence.)

Since constellations are often obtained by taking the intersection of
an $N$-D lattice $\Lambda$ with an $N$-D region $R$, in analogy with
constituent 2-D constellations, it is useful to define constituent
2-D lattices and regions. We define the constituent 2-D lattice
$\Lambda_2$ of $\Lambda$ as the smallest 2-D set $\Lambda_2$ such
that $\Lambda \subseteq \Lambda_2^{N/2}$. Similarly, the constituent
2-D region of $R$ is the smallest 2-D region $R_2$ such that
$R \subseteq R_2^{N/2}$. If $N = 1$, then $\Lambda_2 = \Lambda^2$ and
$R_2 = R^2$. It is clear that $\Omega_2$ is obtained as the
intersection of $\Lambda_2$ with $R_2$.

IV. The Maxwell-Boltzmann Distribution

As pointed out in the Introduction, it follows immediately from the
maximum entropy principle that the Maxwell-Boltzmann distribution
maximizes bit rate for a fixed average energy. (For a good
introduction to the maximum entropy principle, see [1, ch. 11].)
Equivalently, the Maxwell-Boltzmann distribution minimizes average
energy for a fixed bit rate.$^2$ Signal point selection with a
Maxwell-Boltzmann distribution causes a constellation point $r$, with
energy $\|r\|^2$, to be selected with probability
$p(r) \propto \exp(-\lambda \|r\|^2)$, where the parameter
$\lambda \ge 0$ governs the trade-off between bit rate and average
energy. More precisely, the optimal distribution is one in which

$$p(r) \triangleq \exp(-\lambda \|r\|^2)/Z(\lambda), \quad \lambda \ge 0, \tag{11}$$

where the partition function $Z(\lambda)$ is chosen to normalize the
distribution, i.e.,

$$Z(\lambda) \triangleq \sum_{r \in \Omega} \exp(-\lambda \|r\|^2), \quad \lambda \ge 0. \tag{12}$$

(For infinite constellations, $\lambda$ must be strictly positive.)
The Maxwell-Boltzmann distribution arises in many contexts, e.g., in
the optimization of permutation modulation for quantization [26] and
for transmission [27], in neural networks [28], and in simulated
annealing [29], among others.

$^2$ In both optimization problems, the given constellation must be
able to support the given bit rate or the given average energy, so
these values are themselves constrained; this is pointed out at the
end of Section IV.

For finite constellations, setting $\lambda = 0$ yields a signaling
scheme in which constellation points are selected with uniform
probability; thus, "classical" fixed-rate signaling schemes appear
here as a special case. Note too that, with a Maxwell-Boltzmann
distribution, outer points (points with large energy) are never
selected more often than inner points (points with small energy). An
equivalence class of the points of $\Omega$ all having the same
energy is called a shell of the constellation. With a
Maxwell-Boltzmann distribution, the points of a shell are selected
equally often. Indeed, if all constellation points lie in the same
shell, "classical" uniform signaling is obtained for all values of
$\lambda$.
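
These properties are easy to observe on a small example. The Python sketch below computes probabilities proportional to $\exp(-\lambda \|r\|^2)$ over a finite constellation; the 16-point constellation and the function name are assumptions introduced only for illustration.

```python
import math


def maxwell_boltzmann(points, lam):
    """Return the probability exp(-lam * ||r||^2) / Z(lam) of each point r."""
    weights = [math.exp(-lam * sum(x * x for x in r)) for r in points]
    z = sum(weights)                      # partition function Z(lam)
    return [w / z for w in weights]


# Hypothetical 16-point square constellation with half-integer coordinates.
QAM16 = [(x + 0.5, y + 0.5) for x in range(-2, 2) for y in range(-2, 2)]

if __name__ == "__main__":
    for r, p in zip(QAM16, maxwell_boltzmann(QAM16, 0.5)):
        print(r, round(p, 4))
```

With `lam = 0` every point receives probability 1/16, while for positive `lam` the four innermost points are the most likely and all points of equal energy share the same probability.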

In statistical mechanics, much attention is paid to the computation
of the partition function $Z(\lambda)$ (12) in various physical
systems. This is due to the fact that the average energy and entropy
are easily obtained in terms of $Z(\lambda)$. Indeed, the normalized
average energy (4) is obtained as

$$E = -\frac{2}{N} \frac{d \ln Z(\lambda)}{d\lambda}, \tag{13}$$

and the normalized bit rate (3) is obtained as

$$\beta = \frac{2}{N} \log_2 Z(\lambda) + \lambda E \log_2 e. \tag{14}$$

The partition function is easily obtained in terms of the theta
series [30] or Euclidean weight distribution [10] of a constellation.
The theta series for a constellation $\Omega$ is simply a generating
function for the set of energy values (squared norms) taken on by the
points of $\Omega$, and is defined as
$\Theta(x) \triangleq \sum_{r \in \Omega} x^{\|r\|^2}$, where we
interpret $\Theta(x)$ as a real function of $x$, and note that
$Z(\lambda) = \Theta[\exp(-\lambda)]$.
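
The relations between the partition function, the average energy, and the bit rate can be verified numerically. Substituting $p(r) = \exp(-\lambda\|r\|^2)/Z(\lambda)$ into the entropy (3) gives $\beta = (2/N)\log_2 Z(\lambda) + \lambda E \log_2 e$, and differentiating $\ln Z(\lambda)$ gives $E = -(2/N)\, d\ln Z/d\lambda$. The Python sketch below compares these expressions with direct sums; the central-difference derivative, the step size, and all function names are choices made for this illustration.

```python
import math


def partition(points, lam):
    """Partition function Z(lam) for a list of points (tuples)."""
    return sum(math.exp(-lam * sum(x * x for x in r)) for r in points)


def energy_and_rate(points, lam):
    """Direct evaluation of the normalized energy (4) and bit rate (3)."""
    n = len(points[0])
    z = partition(points, lam)
    probs = [math.exp(-lam * sum(x * x for x in r)) / z for r in points]
    energy = (2 / n) * sum(p * sum(x * x for x in r) for p, r in zip(probs, points))
    rate = -(2 / n) * sum(p * math.log2(p) for p in probs)
    return energy, rate


def energy_and_rate_from_z(points, lam, h=1e-6):
    """Same quantities from ln Z, using a central difference for d ln Z / d lam."""
    n = len(points[0])
    dlnz = (math.log(partition(points, lam + h))
            - math.log(partition(points, lam - h))) / (2 * h)
    energy = -(2 / n) * dlnz
    rate = (2 / n) * math.log2(partition(points, lam)) + lam * energy * math.log2(math.e)
    return energy, rate


# Hypothetical 8-point 1-D constellation with half-integer coordinates.
PAM8 = [(i + 0.5,) for i in range(-4, 4)]

if __name__ == "__main__":
    print(energy_and_rate(PAM8, 0.3))
    print(energy_and_rate_from_z(PAM8, 0.3))
```

The two routes agree to within the accuracy of the numerical derivative, which is why the partition function alone is enough to characterize a Maxwell-Boltzmann signaling scheme.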

The Maxwell-Boltzmann parameter $\lambda$ governs the trade-off
between bit rate and average energy. In analogy with statistical
mechanics, we might call $\lambda$ the "inverse temperature" of the
Maxwell-Boltzmann distribution, i.e., $\lambda = 1/(kT)$, where, in
statistical mechanics, $k$ is the Boltzmann constant and $T$ is the
temperature. When $\lambda = 0$ (infinite "temperature"), the uniform
distribution is obtained, corresponding to the maximum possible
entropy for the given constellation. (In statistical mechanics, all
states of a system are equally occupied at infinite temperature.) As
$\lambda \to \infty$ (or the "temperature" cools toward absolute
zero), the bit rate as well as the average energy are reduced as the
points with large energy are selected less frequently. The "limiting
constellation" (obtained at absolute zero "temperature") consists of
only the innermost points of the original constellation (the ground
states in statistical mechanics), and these points are selected
equally often.

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 39, NO. 3, MAY 1993

An important property of the Maxwell-Boltzmann distribution is its
"separability" property. Suppose the $N$-D constellation $\Omega$ is
the Cartesian product of two or more "factor constellations," i.e.,
$\Omega = \Omega_1 \times \Omega_2 \times \cdots \times \Omega_J$,
$J \ge 2$, where $\Omega_i$ is $N_i$-dimensional and
$N = \sum_{i=1}^{J} N_i$. Then it follows that

$$Z(\lambda) = \prod_{i=1}^{J} Z_i(\lambda), \tag{15}$$

where $Z_i(\lambda)$ is the partition function over the $i$th factor
constellation, i.e.,
$Z_i(\lambda) = \sum_{r_i \in \Omega_i} \exp(-\lambda \|r_i\|^2)$.
When (15) holds, the Maxwell-Boltzmann distribution with parameter
$\lambda$ is separable into the product of Maxwell-Boltzmann
distributions over the factor constellations, each with parameter
$\lambda$. In practice, this means that optimal nonuniform signaling
can be implemented on separable constellations by independently
implementing nonuniform signaling on each of the factor
constellations. From (13) and (14) it follows that $E$ and $\beta$
can be obtained by a weighted average of the corresponding factor
constellation quantities, i.e., $E = \sum_i E_i N_i/N$ and
$\beta = \sum_i \beta_i N_i/N$.
If the constellation $\Omega$ has $|\Omega|$ points in total, and
$N_{\mathrm{in}}$ "innermost" points of minimum energy, then as
$\lambda$ ranges from 0 to $+\infty$, every value of $\beta$ in the
range $(2 \log_2 (N_{\mathrm{in}})/N,\ 2 \log_2 (|\Omega|)/N]$ is
obtained, and we say that the constellation supports every bit rate
in this range with a Maxwell-Boltzmann distribution. Similarly, over
the same range of $\lambda$ values, the constellation supports
average energy values in the range
$(E_{\min}, E_{\mathrm{uniform}}]$, where $E_{\min}$ is the
normalized energy of a minimum-energy point in $\Omega$, and
$E_{\mathrm{uniform}}$ is the normalized average energy under uniform
signaling. An infinite lattice can support any positive bit rate and
any positive average energy with a Maxwell-Boltzmann distribution.
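
The range of supported bit rates can be traced by sweeping the parameter. The Python sketch below evaluates the normalized bit rate of a Maxwell-Boltzmann distribution on a hypothetical 16-point 2-D constellation with four innermost points, so the rate should fall from 4 toward 2 bits/2-D as the parameter grows. Subtracting the minimum energy before exponentiating is a numerical convenience introduced here; it cancels in the normalized probabilities.

```python
import math


def mb_rate(points, lam):
    """Normalized bit rate (bits/2-D) under a Maxwell-Boltzmann distribution."""
    n = len(points[0])
    energies = [sum(x * x for x in r) for r in points]
    e0 = min(energies)
    # Shifting by e0 leaves the probabilities unchanged but avoids underflow of Z.
    weights = [math.exp(-lam * (e - e0)) for e in energies]
    z = sum(weights)
    return -(2 / n) * sum((w / z) * math.log2(w / z) for w in weights if w > 0)


# Hypothetical 16-point square constellation: 4 innermost points, N = 2.
QAM16 = [(x + 0.5, y + 0.5) for x in range(-2, 2) for y in range(-2, 2)]

if __name__ == "__main__":
    for lam in (0.0, 0.25, 0.5, 1.0, 2.0, 5.0, 50.0):
        print(lam, mb_rate(QAM16, lam))
```

The lower end of the range is approached but never reached for finite parameter values, which matches the half-open interval in the text.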

V. Spherical Constellations

In this section, we apply the Maxwell-Boltzmann distribution to
spherical constellations based on the densest known lattices in
various numbers $N$ of dimensions. In one dimension, the
constellations are based on the integer lattice $\mathbb{Z}$, while
in two dimensions, the constellations are based on $A_2$, the
hexagonal lattice. In higher dimensions, the constellations are based
on the lattices denoted $E_6$, $E_8$, and $K_{12}$; the subscript in
this notation displays the number of dimensions $N$. Basis vectors
and extensive theta series tables for these lattices are given in
[24, ch. 4]. Each constellation is obtained by taking some number $M$
of the points of smallest energy in the lattice, where $M$ is chosen
so as to include some integral number of lattice shells.
Equivalently, we may think of the constellations as being obtained by
forming the intersection of the infinite lattice with an $N$-sphere
centered at the origin; hence the term spherical constellations.$^3$
We note that, due to the separability property of the
Maxwell-Boltzmann distribution, the results we obtain for these
spherical constellations are also applicable to those nonspherical
constellations that can be expressed as Cartesian products of
spherical constellations. For example, $N$-cube shaped constellations
based on the integer lattice $\mathbb{Z}^N$ are Cartesian products of
simple 1-D "spherical" constellations based on $\mathbb{Z}$.

$^3$ Caution: some authors reserve this term to refer to
constellations in which all points are on the surface of a sphere; in
this paper, spherical constellations do, in general, include interior
points as well.

[Fig. 2. Normalized gain of spherical constellations drawn from
various dense lattices with signal point selection performed
according to the Maxwell-Boltzmann distribution. Horizontal axis:
$\beta$ (bits/2-D channel-use).]

Plotted in Fig. 2 is the "normalized" gain that selected
constellations provide for various bit rates when constellation
points are selected with a Maxwell-Boltzmann distribution. The
normalized gain is obtained from the gain $G$, defined in (8), by
dividing by the coding gain $\gamma_c(\Lambda)$ (10) of the lattice
from which the constellation is drawn. Each curve in Fig. 2 is
obtained by varying the parameter $\lambda$ from zero, which
corresponds to the maximum bit rate (rightmost) point in each curve,
through positive values. (Recall that $\lambda = 0$ corresponds to
the "classical" case of a uniform distribution.) Curves corresponding
to constellations drawn from the same lattice but extending further
to the right, i.e., to larger bit rates, correspond to larger
constellations. Also plotted in Fig. 2 is the function

$$U(\beta) \triangleq (\pi e/6)(1 - 2^{-\beta}). \tag{16}$$

As we shall explain in Section VI, for bit rates $\beta$ greater than
about 2.5, $U(\beta)$ forms the "upper envelope" of the gain curves,
to good approximation.
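As a rough illustration of how one such curve can be traced, the following Python sketch sweeps the parameter $\lambda$ for a 16-point PAM constellation with unit spacing and compares the resulting normalized gain with $U(\beta)$ from (16). The gain expression (8) is not reproduced in this section; the sketch assumes the form $(2^{\beta} - 1)/(6E)$ for a unit-volume lattice, which agrees with the factorization (24). The function names are illustrative only.

```python
import math

def pam_points(m):
    """M equally spaced points with unit spacing, centered at the origin."""
    return [i - (m - 1) / 2.0 for i in range(m)]

def mb_rate_and_energy(points, lam, n_dims=1):
    """Normalized bit rate and energy (per 2-D) under a Maxwell-Boltzmann law."""
    weights = [math.exp(-lam * r * r) for r in points]
    z = sum(weights)  # partition function Z(lambda)
    probs = [w / z for w in weights]
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    beta = (2.0 / n_dims) * entropy
    energy = (2.0 / n_dims) * sum(p * r * r for p, r in zip(probs, points))
    return beta, energy

def normalized_gain(beta, energy):
    """Assumed form of G / gamma_c for a unit-volume lattice."""
    return (2.0 ** beta - 1.0) / (6.0 * energy)

def upper_envelope(beta):
    """U(beta) as defined in (16)."""
    return (math.pi * math.e / 6.0) * (1.0 - 2.0 ** (-beta))

if __name__ == "__main__":
    points = pam_points(16)
    for lam in (0.0, 0.02, 0.05, 0.1):
        beta, energy = mb_rate_and_energy(points, lam)
        print(f"lambda={lam:.2f}  beta={beta:.3f}  "
              f"gain={normalized_gain(beta, energy):.4f}  "
              f"U(beta)={upper_envelope(beta):.4f}")
```

At $\lambda = 0$ the sketch returns a normalized gain of one, the uniform-signaling baseline for a 1-D constellation; positive $\lambda$ trades bit rate for gain.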

Fig. 2 has several noteworthy features. First, notice that the 
gain obtained by nonuniform signaling with a constellation can 
significantly exceed that provided under uniform signaling, at 
the expense of a reduction in bit rate. The additional gain 
provided under nonuniform signaling is called the "biasing 
gain" [7]. 

The curves corresponding to large constellations tend to merge with
curves corresponding to smaller constellations as $\lambda$ is
increased. This happens because the smaller constellations are
subconstellations of the large constellations. As $\lambda$ is
increased, the outer points of the large constellations are selected
very infrequently, so that, in effect, these outer points can be
neglected and the large constellation "shrinks" into a smaller
constellation.

Note also that each curve tends to merge with the $U(\beta)$ curve.
Comparing (16) to (9), we see that this merging implies that the
shaping gain under nonuniform signaling approaches the ultimate limit
of $\pi e/6$ as $\lambda$ becomes large, and that this limit is
obtained independently of dimension.

Fig. 2 also illustrates the "law of diminishing returns" governing
the biasing gain. Recall that the rightmost point of each curve
($\lambda = 0$) in the graph corresponds to uniform signaling. We see
that, for constellations having dimension greater than unity, some
"initial" gain is available under uniform signaling. Furthermore, the
initial gain increases with increasing dimension. This initial gain
is, of course, the shaping gain of the $N$-sphere under uniform
signaling, which increases with $N$ and ultimately approaches the
value $\pi e/6$. This forces the ultimate biasing gain (the
difference, in decibels, between the ultimate shaping gain and the
shaping gain of the $N$-sphere) to decrease with dimension.

Qualitatively, this law of diminishing returns arises due to a
phenomenon known as the "sphere hardening effect" (see, e.g., [31]).
In a many-dimensional sphere, almost all of the volume is located
near the surface of the sphere; consequently, almost all
constellation points lie near or on the surface as well. Since these
points all have nearly the same energy, i.e., nearly the same cost,
signaling with a Maxwell-Boltzmann distribution will cause these
points to be selected almost equally often. Thus, uniform signaling
with spherical constellations becomes increasingly effective as the
dimension increases, ultimately approaching the performance of
nonuniform signaling.

VI. Continuous Approximations

Let $\Omega(\Lambda, \mathbf{R})$ be a constellation obtained from the
intersection of an $N$-D lattice $\Lambda$ (or a translate
$a + \Lambda$ of $\Lambda$) with a finite $N$-D region $\mathbf{R}$.
In [2], Forney and Wei were able to obtain much insight into the
performance of such constellations via the so-called "continuous
approximation," obtained by replacing discrete sums over the points
of $\Omega$ with properly normalized integrals over the region
$\mathbf{R}$. In this section we use the same approach to obtain
similar insight in the case of nonuniform signaling, essentially by
replacing discrete Maxwell-Boltzmann distributions with continuous
Gaussian distributions, truncated to the region $\mathbf{R}$. The
continuous approximation allows us to obtain an estimate for the
partition function $Z(\lambda)$, from which estimates of all other
relevant system parameters are obtained. The results of Forney and
Wei for uniform signaling [2] appear as a special case, obtained when
the Maxwell-Boltzmann parameter $\lambda$ is set to zero.

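A. Energy and Entropy Approximations

Let $f: \mathbb{R}^N \to \mathbb{R}$, a function of $N$ variables, be
Riemann-integrable over an $N$-D region $\mathbf{R}$. Given an $N$-D
lattice $\Lambda^*$, we have

$$\int_{\mathbf{R}} f(\mathbf{r})\, dV(\mathbf{r}) = \lim_{\alpha \to 0} \sum_{\mathbf{r} \in \Omega(\alpha\Lambda^*, \mathbf{R})} f(\mathbf{r})\, V(\alpha\Lambda^*), \tag{17}$$

where $\Omega(\alpha\Lambda^*, \mathbf{R}) = \alpha\Lambda^* \cap \mathbf{R}$,
and $V(\alpha\Lambda^*)$ denotes the volume of a fundamental region of
the scaled lattice $\alpha\Lambda^*$. For small $\alpha$, we expect
the summation on the right-hand side of (17) to be a good
approximation to the integral on the left-hand side of (17). Setting
$\Lambda = \alpha\Lambda^*$ ($\alpha$ small), we obtain from (17) the
general continuous approximation

$$\sum_{\mathbf{r} \in \Omega(\Lambda, \mathbf{R})} f(\mathbf{r}) \approx \frac{1}{V(\Lambda)} \int_{\mathbf{R}} f(\mathbf{r})\, dV(\mathbf{r}). \tag{18}$$

For example, setting $f(\mathbf{r}) = 1$ yields the approximation

$$|\Omega(\Lambda, \mathbf{R})| \approx V(\mathbf{R})/V(\Lambda), \tag{19}$$

where $V(\mathbf{R})$ denotes the volume of the region $\mathbf{R}$.
This is "Proposition 1" of Forney and Wei [2].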

For nonuniform signaling with a Maxwell-Boltzmann distribution, the
average energy and bit rate are determined by the partition function
$Z(\lambda)$ (12). The continuous approximation (18) yields

$$Z(\lambda) \approx \frac{1}{V(\Lambda)} \int_{\mathbf{R}} \exp(-\lambda\|\mathbf{r}\|^2)\, dV(\mathbf{r}). \tag{20}$$

Combining approximation (20) with (13), we approximate $E$, the
normalized average energy (4), by

$$E(\mathbf{R}, \lambda) \triangleq \frac{2}{N} \int_{\mathbf{R}} \|\mathbf{r}\|^2 f(\mathbf{r}, \lambda)\, dV(\mathbf{r}), \tag{21}$$

where

$$f(\mathbf{r}, \lambda) = \frac{\exp(-\lambda\|\mathbf{r}\|^2)}{\int_{\mathbf{R}} \exp(-\lambda\|\mathbf{r}\|^2)\, dV(\mathbf{r})}. \tag{22}$$

Note that $f(\mathbf{r}, \lambda)$ represents a continuous Gaussian
probability density function, truncated to the region $\mathbf{R}$.
The continuous approximation (21) estimates the normalized average
constellation energy by the normalized average energy of this
continuous random variable.

In the same way, combining approximation (20) with (14), we
approximate $\beta$, the normalized bit rate (3), by

$$\beta(\mathbf{R}, \lambda) \triangleq \frac{2}{N}\left[H(\mathbf{R}, \lambda) - \log_2 V(\Lambda)\right], \tag{23}$$

where

$$H(\mathbf{R}, \lambda) = -\int_{\mathbf{R}} f(\mathbf{r}, \lambda) \log_2 f(\mathbf{r}, \lambda)\, dV(\mathbf{r})$$

is the differential entropy of a continuous Gaussian random variable,
truncated to the region $\mathbf{R}$. The continuous approximations
(21) and (23) were also used by Forney and Wei [2, section IV-B], and
by Forney in [6, section V].
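The following Python sketch compares the approximations (21) and (23) with exact discrete sums for a 1-D example: an $M$-point PAM constellation with unit spacing, approximated by the interval $[-M/2, M/2]$ with $V(\Lambda) = 1$. The closed forms for the truncated Gaussian (written with the error function) are derived here for illustration and are not taken from the paper; the function names are hypothetical.

```python
import math

def discrete_pam(m, lam):
    """Exact (beta, E) for an M-point unit-spacing PAM under a Maxwell-Boltzmann law."""
    points = [i - (m - 1) / 2.0 for i in range(m)]
    weights = [math.exp(-lam * r * r) for r in points]
    z = sum(weights)
    probs = [w / z for w in weights]
    beta = 2.0 * -sum(p * math.log2(p) for p in probs)       # N = 1, so 2/N = 2
    energy = 2.0 * sum(p * r * r for p, r in zip(probs, points))
    return beta, energy

def continuous_pam(m, lam):
    """Continuous approximations (21) and (23) on R = [-M/2, M/2], V(Lambda) = 1."""
    radius = m / 2.0
    if lam == 0.0:
        second_moment = radius ** 2 / 3.0       # uniform density on [-R, R]
        entropy_nats = math.log(2.0 * radius)
    else:
        # Integral of exp(-lam r^2) over [-R, R]
        zc = math.sqrt(math.pi / lam) * math.erf(radius * math.sqrt(lam))
        # Second moment of the truncated Gaussian density f(r, lam)
        second_moment = 1.0 / (2.0 * lam) - radius * math.exp(-lam * radius ** 2) / (lam * zc)
        # Differential entropy: -E[ln f] = lam * E[r^2] + ln(zc)
        entropy_nats = lam * second_moment + math.log(zc)
    beta = 2.0 * entropy_nats / math.log(2.0)
    energy = 2.0 * second_moment
    return beta, energy

if __name__ == "__main__":
    for lam in (0.0, 0.02, 0.05, 0.1):
        print(lam, discrete_pam(16, lam), continuous_pam(16, lam))
```

For this example the two pairs of values agree to within a few percent, with the energy at $\lambda = 0$ differing by the factor discussed in Section VI-D.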

B. Shaping Gain Approximation

We now use (21) and (23) to estimate the gain provided by nonuniform
signaling. Writing $\beta(\mathbf{R}, \lambda)$ for $\beta$ and
$E(\mathbf{R}, \lambda)$ for $E$ in the gain expression (8) yields
the approximation $G \approx G(\mathbf{R}, \lambda)$, where

$$G(\mathbf{R}, \lambda) \triangleq \left[\frac{d_{\min}^2(\Lambda)}{V(\Lambda)^{2/N}}\right] \left[\frac{V(\Lambda)^{2/N}\, 2^{\beta(\mathbf{R}, \lambda)}}{6E(\mathbf{R}, \lambda)}\right] \left(1 - 2^{-\beta(\mathbf{R}, \lambda)}\right). \tag{24}$$

As in (9), we have grouped the gain $G(\mathbf{R}, \lambda)$ (24)
into three factors: the first being the coding gain
$\gamma_c(\Lambda) = d_{\min}^2(\Lambda)/V(\Lambda)^{2/N}$ of the
lattice $\Lambda$ [10], the second being a continuous approximation
for the shaping gain, namely,

$$\gamma_s(\mathbf{R}, \lambda) \triangleq \frac{V(\Lambda)^{2/N}\, 2^{\beta(\mathbf{R}, \lambda)}}{6E(\mathbf{R}, \lambda)}, \tag{25}$$

and the third being the discretization factor

$$\gamma_d(\mathbf{R}, \lambda) \triangleq 1 - 2^{-\beta(\mathbf{R}, \lambda)}.$$

The discretization factor was omitted in the analysis of Forney and
Wei [2], who were interested in asymptotic ($\beta \to \infty$)
limits for the shaping gain. However, for most practical values of
$\beta$, this factor is not insignificant and should not be omitted.
Indeed, as will become evident, including this factor provides
accurate estimates for gain, even for relatively small values of
$\beta$ (see Fig. 3 and Section VI-D).

We now estimate the biasing gain [7], i.e., the additional gain that
nonuniform signaling can provide over uniform signaling with the same
constellation. Setting $\lambda = 0$ reduces (25) to the special case
of uniform signaling discussed in [2]. The shaping gain
$\gamma_s(\mathbf{R}, \lambda)$ can be written in terms of
$\gamma_s(\mathbf{R}, 0)$ as

$$\gamma_s(\mathbf{R}, \lambda) = \gamma_s(\mathbf{R}, 0) \cdot 2^{-\rho(\mathbf{R}, \lambda)} \cdot g_s(\mathbf{R}, \lambda), \tag{26}$$

where

$$\rho(\mathbf{R}, \lambda) \triangleq \beta(\mathbf{R}, 0) - \beta(\mathbf{R}, \lambda) \tag{27}$$

is the normalized redundancy (or loss in bit rate) caused by
selecting constellation points with a nonuniform distribution.
Clearly, $\gamma_b(\mathbf{R}, \lambda)$, the total biasing gain, is
given by the product of the second and third factors in (26), i.e.,

$$\gamma_b(\mathbf{R}, \lambda) = g_s(\mathbf{R}, \lambda)\, 2^{-\rho(\mathbf{R}, \lambda)}. \tag{28}$$

In (28), we have identified two separate factors that characterize
the biasing gain. The "energy savings factor"

$$g_s(\mathbf{R}, \lambda) \triangleq E(\mathbf{R}, 0)/E(\mathbf{R}, \lambda) \geq 1 \tag{29}$$

accounts for the energy savings that result when constellation points
of low energy are selected more often than points of large energy. Of
course, selecting points with a nonuniform distribution results in a
loss of entropy and hence a drop in the baseline average energy. The
"energy loss factor" $2^{-\rho(\mathbf{R}, \lambda)}$ accounts for
this drop.

C. CER2 and PAR2 Approximations

We now provide continuous approximations for CER2 and PAR2, two
constellation parameters defined in Section III. From (19), we
estimate $|\Omega_2| \approx V(\mathbf{R}_2)/V(\Lambda_2)$ for the
size of the constituent 2-D constellation, where $V(\mathbf{R}_2)$ is
the volume (actually area) of $\mathbf{R}_2$, the constituent 2-D
region of $\mathbf{R}$, and $V(\Lambda_2)$ is the fundamental volume
(actually area) of $\Lambda_2$, the constituent 2-D lattice of
$\Lambda$. Combining this with our approximation (23) for the bit
rate, we obtain

$$\mathrm{CER2}(\Omega) \approx \left[V(\Lambda)^{2/N}/V(\Lambda_2)\right] \cdot \left[V(\mathbf{R}_2)/2^{(2/N)H(\mathbf{R}, \lambda)}\right].$$

Here, as in [2], we identify two independent factors: the coding
constellation expansion ratio of $\Lambda$,
$\mathrm{CER2}_c(\Lambda) = V(\Lambda)^{2/N}/V(\Lambda_2)$, and the
shaping constellation expansion ratio,
$\mathrm{CER2}_s(\mathbf{R}, \lambda) = V(\mathbf{R}_2)/2^{(2/N)H(\mathbf{R}, \lambda)}$.
Again, as in the case of gain, one component, the coding
constellation expansion ratio, is a geometric property of the lattice
$\Lambda$ and is unaffected by the probability distribution with
which the constellation points are selected. The other component, the
shaping constellation expansion ratio, depends both on the region
$\mathbf{R}$ and the parameter $\lambda$.

When $\lambda = 0$, we obtain the special case of uniform signaling,
where

$$\mathrm{CER2}_s(\mathbf{R}, 0) = V(\mathbf{R}_2)/V(\mathbf{R})^{2/N}.$$

We may write $\mathrm{CER2}_s(\mathbf{R}, \lambda)$ in terms of
$\mathrm{CER2}_s(\mathbf{R}, 0)$ as

$$\mathrm{CER2}_s(\mathbf{R}, \lambda) = \mathrm{CER2}_s(\mathbf{R}, 0) \cdot 2^{\rho(\mathbf{R}, \lambda)},$$

where $\rho(\mathbf{R}, \lambda)$ (27) is the normalized redundancy
under nonuniform signaling. We see that in addition to the
constellation expansion due to uniform signaling, we have incurred an
additional constellation expansion factor due to the loss in rate
caused by nonuniform signaling. Thus, for a fixed constellation,
$\mathrm{CER2}_s(\mathbf{R}, 0)$, the shaping constellation expansion
ratio induced by uniform signaling, is a lower bound to
$\mathrm{CER2}_s(\mathbf{R}, \lambda)$, the shaping constellation
expansion ratio under nonuniform signaling.

To estimate $\mathrm{PAR2}(\Omega)$ (11), we note that the peak
energy of the constituent 2-D constellation is a geometric property
unaffected by the probability with which constellation points are
selected. The average energy, on the other hand, is reduced from its
value under uniform signaling by the energy savings factor
$g_s(\mathbf{R}, \lambda)$; thus PAR2 is increased by the same
factor, i.e.,

$$\mathrm{PAR2}(\mathbf{R}, \lambda) = \mathrm{PAR2}(\mathbf{R}, 0)\, g_s(\mathbf{R}, \lambda),$$

where $\mathrm{PAR2}(\mathbf{R}, 0)$ denotes the PAR2 under uniform
signaling.

D. Applying the Continuous Approximations

In applying these continuous approximations to actual constellations,
one is confronted with a certain flexibility in the choice of
approximating region $\mathbf{R}$. For example, to approximate the
behavior of an $M$-point, symmetric PAM constellation based on $Z$,
one would choose $\mathbf{R} = [-R, R]$, a 1-D "sphere" of radius
$R$; however, it is not clear which choice for radius $R$ is best.
Indeed, the "best" choice for $R$ depends both on the constellation
parameter (be it average energy, entropy, or whatever) that one is
trying to estimate, and on the value of the Maxwell-Boltzmann
parameter $\lambda$. The same flexibility in choice of sphere radius
$R$ occurs when one attempts to approximate the behavior of an $N$-D
spherical constellation with an $N$-sphere $B_N(R)$.

Since the size $|\Omega|$ of the constellation is assumed known, one
approach is to choose the radius $R$ so as to match the bit rate
estimate (23) at $\lambda = 0$ with the actual bit rate
$(2/N)\log_2|\Omega|$ at $\lambda = 0$. In effect, this forces the
volume of the region $\mathbf{R}$ to satisfy
$V(\mathbf{R}) = |\Omega| V(\Lambda)$, so that (19) is satisfied with
equality. From extensive numerical calculations, we have found that
this approach gives very satisfactory estimates for bit rate and
average energy, over a fairly wide range for $\lambda$.
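To make the volume-matching rule concrete, here is a small Python sketch for the lattice $Z^2$ (so $V(\Lambda) = 1$). It counts the lattice points in a disc, which is the quantity estimated by (19), and computes the disc radius whose area equals $|\Omega| V(\Lambda)$. The helper names are illustrative and not from the paper.

```python
import math

def count_z2_points_in_disc(radius):
    """Number of points (x, y) of Z^2 with x^2 + y^2 <= radius^2."""
    r2 = radius * radius
    limit = int(math.floor(radius))
    count = 0
    for x in range(-limit, limit + 1):
        # Largest |y| allowed in this column of the lattice
        y_max = math.isqrt(int(math.floor(r2 - x * x)))
        count += 2 * y_max + 1
    return count

def matched_radius(num_points, lattice_volume=1.0):
    """Radius of the 2-D disc whose area equals num_points * V(Lambda)."""
    return math.sqrt(num_points * lattice_volume / math.pi)

if __name__ == "__main__":
    for radius in (2, 5, 10):
        size = count_z2_points_in_disc(radius)
        print(f"radius={radius}  |Omega|={size}  area={math.pi * radius ** 2:.1f}  "
              f"matched radius={matched_radius(size):.3f}")
```

The matched radius generally differs slightly from the radius of the smallest disc containing the constellation, which is the flexibility described above.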

Applying this approach to the $M$-PAM example, we find that
$R = M/2$ and $E \approx M^2/6$. However, the actual energy is
$E = (M^2 - 1)/6$; thus, using the continuous approximation, we have
incurred an error that is a factor of $(1 - M^{-2}) = 1 - 2^{-\beta}$.
The discretization factor, $\gamma_d(\beta) = 1 - 2^{-\beta}$, can
thus be interpreted as a correction factor used to adjust the average
energy when applying the continuous approximation to the baseline
constellations under uniform signaling.
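The $M$-PAM arithmetic can be checked exactly with rational numbers. The sketch below assumes unit spacing and points centered on the origin, and uses the per-two-dimensions normalization (a factor $2/N$ with $N = 1$); the function name is illustrative.

```python
from fractions import Fraction

def pam_energy_per_2d(m):
    """Exact normalized average energy of uniform M-PAM with unit spacing."""
    points = [Fraction(2 * i - (m - 1), 2) for i in range(m)]
    mean_square = sum(r * r for r in points) / m
    return 2 * mean_square  # factor 2/N with N = 1

if __name__ == "__main__":
    for m in (2, 4, 8, 16):
        actual = pam_energy_per_2d(m)
        approx = Fraction(m * m, 6)      # continuous approximation with R = M/2
        # With beta = 2 log2(M), the ratio equals 1 - 2^(-beta) = 1 - M^(-2)
        print(m, actual, approx, actual / approx)
```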

E. Spherical and Cubic Constellations

Continuous approximations for the various constellation parameters
are derived in Appendix B for the important case in which the region
$\mathbf{R} = B_N(R)$, an $N$-ball of radius $R$ centered at the
origin. These include cubic constellations, which are Cartesian
products of 1-D "spheres," as a special case.

In order to compare the shaping gain predicted by these continuous
approximations to actual shaping gain, we have plotted in Fig. 3
normalized gain curves for spherical constellations $\Omega$ in
various dimensions. As in Fig. 2, the gain values are normalized by
dividing the total gain by the coding gain of the lattice from which
the constellation is drawn. Solid curves represent the actual
normalized gain $G/\gamma_c$, computed from (8). The dotted curves
give
$(|\Omega|^{2/N} 2^{-\rho(B_N(R), \lambda)} - 1)/[6E(B_N(R), \lambda)]$,
as defined in Appendix B. The radius $R$ in each case is chosen so
that $V(B_N(R))/V(\Lambda) = |\Omega|$. Also plotted in Fig. 3 is the
function $U(\beta)$ (16), which represents the "upper envelope" shown
in the figure.

We see that the curves corresponding to the approximate normalized
gain closely match the actual normalized gain curves for all values
of $\beta > 2$, although some difference is seen for small $\beta$.
For large $\beta$, however, the curves corresponding to the
approximation coincide with the actual normalized gain curves,
confirming the asymptotic accuracy of the continuous approximation.

As pointed out in Appendix B, for large bit rates, the shaping gain
under nonuniform signaling approaches $\pi e/6$, independently of
dimension. The ultimate biasing gain approaches
$\pi e/[6\gamma_\otimes(N)]$, where $\gamma_\otimes(N)$ (39) denotes
the shaping gain of the $N$-sphere under uniform signaling. Since
$\gamma_\otimes(N) \to \pi e/6$ monotonically from below as
$N \to \infty$, $\gamma_b$ approaches unity as the dimension
increases, thus confirming the "law of diminishing returns" discussed
at the end of Section V.
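The diminishing ultimate biasing gain can be tabulated numerically. Equation (39) lies outside this section; the sketch below assumes it is the standard closed form for the shaping gain of an $N$-sphere under uniform signaling, $\gamma_\otimes(N) = \pi (N + 2)/[12\,\Gamma(N/2 + 1)^{2/N}]$. The function names are illustrative.

```python
import math

ULTIMATE_SHAPING_GAIN = math.pi * math.e / 6.0

def sphere_shaping_gain(n):
    """Assumed closed form for the shaping gain of an N-sphere (uniform signaling)."""
    # Gamma(n/2 + 1)^(2/n), evaluated through lgamma to avoid overflow for large n
    gamma_term = math.exp((2.0 / n) * math.lgamma(n / 2.0 + 1.0))
    return math.pi * (n + 2) / (12.0 * gamma_term)

def ultimate_biasing_gain_db(n):
    """Ratio of the ultimate shaping gain to the N-sphere shaping gain, in dB."""
    return 10.0 * math.log10(ULTIMATE_SHAPING_GAIN / sphere_shaping_gain(n))

if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16, 24, 64, 1000):
        print(f"N={n:5d}  sphere gain={10 * math.log10(sphere_shaping_gain(n)):.3f} dB  "
              f"ultimate biasing gain={ultimate_biasing_gain_db(n):.3f} dB")
```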

Curves showing the trade-offs between CER2 and shaping gain or
between PAR2 and shaping gain for $N$-D spherical constellations are
easily obtained from the continuous approximations derived in
Appendix B. However, as asserted by Forney and Wei [2], the best
possible trade-offs are achieved by 2-D spherical constellations,
i.e., by regions shaped as discs in two dimensions. Recall that
Cartesian products of basic regions achieve the same performance as
the basic region itself. Thus, the best region $\mathbf{R}$ for use
with nonuniform signaling in $2n$ dimensions is the $n$-fold
Cartesian product of a 2-D disc (a so-called polydisc [32]), because
this region will achieve a given value of shaping gain with least
CER2 and PAR2. Thus, while nonuniform signaling will always cause a
constellation expansion relative to uniform signaling with the same
constellation, this constellation expansion is never greater and
usually less than would be required under uniform signaling to
achieve the same shaping gain.

Fig. 3. Comparison of the approximate and actual normalized gain of
spherical constellations based on various $N$-dimensional lattices.
The dotted curves are obtained by applying continuous approximations
for gain.

VII. Shaping with Binary Prefix Codes

In this section, we study methods of achieving nonuniform signaling
schemes for the transmission of binary data. Assuming, as usual, that
we wish to transmit the output of a memoryless binary equiprobable
source, the most obvious means of generating events with nonuniform
probabilities is to parse the output of the source into codewords of
variable length. Since the probability of occurrence of a codeword of
length $l$ is $2^{-l}$, shorter (more frequently occurring) codewords
may be mapped to constellation points with low energy and longer
(less frequently occurring) codewords may be mapped to points with
high energy and, in this way, shaping gain may be achieved. This
approach was suggested by a brief example in [13] and is discussed in
greater detail in [14].

To ensure unique and complete parsing, it can be shown (e.g.,
[33, p. 297]) that the variable-length binary codewords must form a
complete binary prefix code, in which the $M$ codeword lengths $l_i$,
$i = 1, \ldots, M$, satisfy

$$\sum_{i=1}^{M} 2^{-l_i} = 1. \tag{30}$$

Although we will not always explicitly refer to them as such, all
prefix codes considered in this paper are complete.
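Condition (30) and the parsing operation are simple to express in code. The following Python sketch checks completeness exactly and parses a bit string with a given codebook; the function names and the example codebook are illustrative.

```python
from fractions import Fraction

def kraft_sum(lengths):
    """Exact value of the sum of 2^(-l_i) over all codeword lengths."""
    return sum(Fraction(1, 2 ** length) for length in lengths)

def is_complete(lengths):
    """True when the lengths satisfy (30) with equality."""
    return kraft_sum(lengths) == 1

def parse(bits, codewords):
    """Split a bit string into codewords of a prefix code.

    Returns the parsed codewords and any incomplete trailing bits.
    """
    parsed, current = [], ""
    for bit in bits:
        current += bit
        if current in codewords:
            parsed.append(current)
            current = ""
    return parsed, current

if __name__ == "__main__":
    code = {"0", "10", "11"}
    print(is_complete([len(w) for w in code]))
    print(parse("0100110", code))
```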

A. Matched Codes 

The simplest example of the idea of mapping a prefix code to a
constellation is probably the following. The output of a binary
equiprobable memoryless source (the data source) is parsed into a
sequence of blocks drawn from the set {0, 10, 11} and these output
blocks are mapped onto a PAM constellation as shown in Fig. 4. Since
$P[0] = 1/2$ and $P[10] = P[11] = 1/4$, this scheme has an (average)
bit rate of 1.5 bits/T. This three-level scheme is quite similar to a
partial-response scheme, but since any level is available for use
during any signaling interval (i.e., different transmitted symbols
are independent), the data rate is greater. If we place two such 1-D
schemes in quadrature, we obtain the 2-D scheme shown in Fig. 5,
which achieves an (average) bit rate of 3 bits/T. This 2-D scheme
provides a shaping gain of 4/3 = 1.25 dB, with a CER2 of
9/8 = 1.125 and a PAR2 of 2. The overall gain (including the
discretization factor $\gamma_d$) is 7/6 = 0.67 dB.

Fig. 4. A simple nonuniform PAM scheme.

Fig. 5. A nonuniform QAM scheme, $\beta = 3$.
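The figures quoted for the 2-D scheme can be reproduced with a few lines of Python. Fig. 4 is not reproduced here, so the assignment of the codewords 10 and 11 to the levels -1 and +1 is an assumption (the results do not depend on it). The sketch also assumes unit lattice volume and the definitions shaping gain $= 2^{\beta}/(6E)$, total gain $= (2^{\beta} - 1)/(6E)$, CER2 $= |\Omega_2|/2^{\beta}$, and PAR2 = peak energy over average energy, which are consistent with (24) and with the numbers stated above.

```python
from itertools import product

# Assumed labeling of the three PAM levels (hypothetical; Fig. 4 is not available here)
PAM_CODE = {"0": 0, "10": -1, "11": +1}

def qam_scheme_stats():
    """Rate, energy, gains, CER2, and PAR2 of two PAM schemes placed in quadrature."""
    entries = []
    for (word_x, x), (word_y, y) in product(PAM_CODE.items(), repeat=2):
        bits = len(word_x) + len(word_y)
        # A codeword pair of total length `bits` occurs with probability 2^(-bits)
        entries.append((2.0 ** -bits, bits, x * x + y * y))
    beta = sum(p * bits for p, bits, _ in entries)    # N = 2, so 2/N = 1
    energy = sum(p * e for p, _, e in entries)
    peak = max(e for _, _, e in entries)
    return {
        "beta": beta,
        "energy": energy,
        "shaping_gain": 2.0 ** beta / (6.0 * energy),
        "total_gain": (2.0 ** beta - 1.0) / (6.0 * energy),
        "cer2": len(entries) / 2.0 ** beta,
        "par2": peak / energy,
    }

if __name__ == "__main__":
    for name, value in qam_scheme_stats().items():
        print(f"{name}: {value:.4f}")
```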

The use of a binary prefix code will not, in general, produce an
optimal nonuniform scheme unless the constellation is "matched" to
the code. A constellation $\Omega$ is said to be matched to some
binary prefix code if, for some $\lambda \geq 0$, a Maxwell-Boltzmann
distribution with parameter $\lambda$ induces probabilities on the
constellation points that are all integral powers of two. This means
that for some $\lambda \geq 0$,

$$e^{-\lambda\|\mathbf{r}\|^2}/Z(\lambda) = 2^{-l(\mathbf{r})} \tag{31}$$

for all $\mathbf{r} \in \Omega$, where $l(\mathbf{r})$ is a positive
integer. The matching condition (31) is trivially satisfied when
$\lambda = 0$ by any constellation whose size is a power of two.
However, for positive $\lambda$, the matching condition is strong,
and we expect relatively few constellations to satisfy it.

Note that, in Fig. 5, each constellation point conveys either two,
three, or four bits (with the outer points conveying more bits than
the inner points). In particular, each point conveys at least two
bits. This implies that we may consider this scheme to consist of a
fixed-rate "primary" channel, conveying two bits per symbol, and a
variable-rate "secondary" channel, conveying an average of one bit
per symbol. In general, a binary prefix code with codeword lengths
$\{l_1 \leq l_2 \leq \cdots \leq l_M\}$ assigned to an $N$-D
constellation of size $M$ will produce a signaling scheme with an
overall normalized average bit rate
$\beta = (2/N)\sum_{i=1}^{M} l_i 2^{-l_i}$. The fixed primary channel
rate is $\beta_p = 2l_1/N$, while the variable secondary channel rate
is $\beta_s = \beta - \beta_p$. It is quite possible to have
$\beta_s > \beta_p$, so the names primary and secondary do not
necessarily refer to relative bit rates. In most practical
circumstances, we will select $\beta_s < \beta_p$. Note also that
since the primary channel operates at a fixed rate, it can operate as
a "standard" channel, and is not affected by the system problems
associated with variable-rate transmission. The "opportunistic
secondary channels" of [2] and the "in-band coding method" of [21]
(see also [17]-[20]) are examples of nonuniform signaling schemes
that use prefix codes to separate data into primary and secondary
channels, although these schemes are not described in terms of prefix
codes.

Fig. 6. A nonuniform QAM scheme, $\beta = 3.5$.

Fig. 7. A nonuniform QAM scheme, $\beta = 2.75$.

Other examples of 2-D constellations matched to prefix codes are
shown in Figs. 6 and 7. The scheme of Fig. 6 was used by Forney and
Wei [2, Fig. 7(b)] to illustrate the notion of an opportunistic
secondary channel, while the scheme of Fig. 7 is based on the 7
points of lowest energy in the 2-D hexagonal lattice $A_2$. A limited
search turned up several additional examples in higher dimensions.
For example, the constellation containing the origin and the first
shell of the lattice $D_4$ in four dimensions has theta series
$1 + 24x$ and is matched to a binary prefix code having one codeword
with two bits and 24 codewords with five bits.
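The matching condition (31) can be tested numerically from a list of shells. In the Python sketch below each shell is a tuple (energy, number of points, codeword length); the candidate $\lambda$ is obtained from the first two shells and then verified on all of them. The function name is illustrative, and the $D_4$ first-shell energy of 2 assumes the usual scaling of that lattice (the outcome of the test does not depend on the scale).

```python
import math

def matching_lambda(shells, tol=1e-9):
    """Return lambda if the Maxwell-Boltzmann probabilities equal 2^(-length)
    on every shell, otherwise None.

    `shells` is a list of (energy, multiplicity, codeword_length) tuples with
    at least two distinct energies.
    """
    kraft = sum(mult * 2.0 ** -length for _, mult, length in shells)
    if abs(kraft - 1.0) > tol:
        return None  # the lengths do not form a complete prefix code
    (e0, _, l0), (e1, _, l1) = shells[0], shells[1]
    # Ratio of the first two probabilities fixes the only possible lambda
    lam = math.log(2.0) * (l1 - l0) / (e1 - e0)
    z = sum(mult * math.exp(-lam * energy) for energy, mult, _ in shells)
    for energy, _, length in shells:
        if abs(math.exp(-lam * energy) / z - 2.0 ** -length) > tol:
            return None
    return lam

if __name__ == "__main__":
    print(matching_lambda([(0, 1, 2), (2, 24, 5)]))             # origin + first shell of D4
    print(matching_lambda([(0, 1, 2), (1, 4, 3), (2, 4, 4)]))   # 9-point scheme of Fig. 5
    print(matching_lambda([(0, 1, 2), (1, 6, 3)]))              # 7 points of A2
```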

B. Huffman Codes

Although, in general, a complete binary prefix code will not match a
constellation in the sense of (31), we nevertheless expect prefix
codes to provide shaping gain. Given an $N$-D constellation of size
$|\Omega|$ in which the $i$th point $\mathbf{r}_i$ has energy
$\|\mathbf{r}_i\|^2$, $1 \leq i \leq |\Omega|$, the optimal
(gain-maximizing) complete binary prefix code with codeword lengths
$l_i$, $1 \leq i \leq |\Omega|$, would maximize the quantity
$(2^{\beta} - 1)/E$, where $\beta = (2/N)\sum_i l_i 2^{-l_i}$ and
$E = (2/N)\sum_i \|\mathbf{r}_i\|^2 2^{-l_i}$, subject to the
constraint (30). Unfortunately, short of searching all complete
binary prefix codes with $|\Omega|$ codewords, we know of no general
method for finding the optimal prefix code.

Rather than attempting to find the optimal code, we have taken the
approach of approximating the optimal Maxwell-Boltzmann distribution
by distributions in which all probabilities are positive integer
powers of 1/2. (Stubley and Blake consider a more general matching
problem in [34].) We refer to such approximations as "dyadic
approximations" to the Maxwell-Boltzmann distribution. To find the
"best" dyadic approximation, we need a measure of distance between
the Maxwell-Boltzmann distribution and its dyadic approximation. A
commonly used measure of distance between two probability
distributions $P$ (with probability masses $\{p_1, \ldots, p_M\}$)
and $Q$ (with probability masses $\{q_1, \ldots, q_M\}$) is the
relative entropy of $P$ with respect to $Q$:

$$D(P, Q) = \sum_{i=1}^{M} p_i \log_2 (p_i/q_i).$$

When $Q$ is dyadic, so that $q_i = 2^{-l_i}$,

$$D(P, Q) = \sum_{i=1}^{M} l_i p_i - H(P), \tag{32}$$

where $H(P) = -\sum_i p_i \log_2 p_i$ is the entropy of $P$.

From the point of view of source coding, $D(P, Q)$ represents the
redundancy of a source code used to represent the output of a
discrete memoryless source with alphabet of size $M$ and distribution
$P$. As is well known, the redundancy (32) is minimized by the
Huffman procedure [35]. Furthermore, the Huffman procedure always
results in a complete prefix code. The existence of an algorithm for
minimizing $D(P, Q)$ is our primary motivation for choosing this
particular measure; indeed, other measures may be more naturally
suited to the problem. Nevertheless, as will become evident, this
approach of minimizing $D(P, Q)$ leads to excellent gain values that
can be made to approach the ultimate shaping gain.
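A minimal Python sketch of this step follows: a textbook heap-based Huffman procedure that returns codeword lengths, together with the redundancy (32). Ties are broken by insertion order, which is an implementation convenience and not necessarily the tie-breaking rule used for the results reported in this paper. The function names are illustrative.

```python
import heapq
import math

def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for the given probabilities."""
    if len(probs) < 2:
        raise ValueError("need at least two symbols")
    # Heap entries: (probability, unique tiebreaker, indices of symbols in the subtree)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tiebreak = len(probs)
    while len(heap) > 1:
        p1, _, symbols1 = heapq.heappop(heap)
        p2, _, symbols2 = heapq.heappop(heap)
        for i in symbols1 + symbols2:
            lengths[i] += 1  # every symbol in the merged subtree moves one level deeper
        heapq.heappush(heap, (p1 + p2, tiebreak, symbols1 + symbols2))
        tiebreak += 1
    return lengths

def redundancy(probs, lengths):
    """D(P, Q) from (32) for the dyadic distribution q_i = 2^(-l_i)."""
    average_length = sum(p * l for p, l in zip(probs, lengths))
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return average_length - entropy

if __name__ == "__main__":
    p = [0.4, 0.3, 0.2, 0.1]
    lengths = huffman_lengths(p)
    print(lengths, redundancy(p, lengths))
```

When the input distribution is already dyadic, the procedure recovers the lengths $l_i = -\log_2 p_i$ and the redundancy is zero.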

To illustrate their performance, we have computed dyadic
approximations to the Maxwell-Boltzmann distribution for two
constellations based on $Z^2$. The two constellations were chosen
quite arbitrarily: one consists of the 21 points of least energy in
$Z^2$; the other consists of the 121 points of least energy. The
results are illustrated in Fig. 8, and were obtained by varying the
Maxwell-Boltzmann parameter $\lambda$ from zero through positive
values. As expected, the dyadic approximations (marked with a
triangle for the 121-point constellation and a square for the
21-point constellation) have lower gain values than those obtained
from the optimal Maxwell-Boltzmann distribution (the solid curves);
however, the gain values do follow the general trends obtained for
the Maxwell-Boltzmann distribution. Due to the flexibility afforded
by having a larger number of points, the larger constellation has a
greater number of different dyadic approximations to the
Maxwell-Boltzmann distribution.

The fact that these dyadic approximations follow the same general
trends obtained for the Maxwell-Boltzmann distribution suggests the
following algorithm for designing prefix codes to achieve shaping
gain. Given an $N$-D constellation $\Omega$, we proceed as follows.

1) From the constellation theta series, we numerically determine the
value of the Maxwell-Boltzmann parameter $\lambda$ that maximizes
gain $G$ (8). Call this value $\lambda_{\rm opt}$.

2) Using $\lambda_{\rm opt}$, we generate a list of Maxwell-Boltzmann
probabilities

$$p_i \triangleq p(\mathbf{r}_i) = \exp(-\lambda_{\rm opt}\|\mathbf{r}_i\|^2)/Z(\lambda_{\rm opt})$$

for all $\mathbf{r}_i \in \Omega$.

3) We apply the Huffman procedure to the list $p_i$ to obtain a
complete binary prefix code. The performance of this code is then
evaluated.

Fig. 8. Illustrating the performance of dyadic approximations to the
Maxwell-Boltzmann distribution obtained from the Huffman procedure.
(Horizontal axis: $\beta$, in bits/2D channel-use.)
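The three steps can be sketched in Python for spherical constellations drawn from $Z^2$. This is an illustrative sketch under several assumptions: the gain (normalized by the coding gain, with unit lattice volume) is taken to be $(2^{\beta} - 1)/(6E)$, consistent with (24); $\lambda_{\rm opt}$ is located by a simple grid search rather than from the theta series; and the Huffman tie-breaking is by insertion order. All function names are hypothetical.

```python
import heapq
import math

def z2_constellation(size, half_width=8):
    """The `size` points of least energy in Z^2 (size should complete a shell)."""
    grid = [(x, y) for x in range(-half_width, half_width + 1)
            for y in range(-half_width, half_width + 1)]
    grid.sort(key=lambda p: p[0] ** 2 + p[1] ** 2)
    return grid[:size]

def mb_probabilities(energies, lam):
    weights = [math.exp(-lam * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def total_gain(beta, energy):
    """Assumed gamma_s * gamma_d for a unit-volume lattice."""
    return (2.0 ** beta - 1.0) / (6.0 * energy)

def mb_gain(energies, lam):
    probs = mb_probabilities(energies, lam)
    beta = -sum(p * math.log2(p) for p in probs if p > 0.0)   # N = 2, so 2/N = 1
    energy = sum(p * e for p, e in zip(probs, energies))
    return total_gain(beta, energy)

def huffman_lengths(probs):
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths, tiebreak = [0] * len(probs), len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, tiebreak, s1 + s2))
        tiebreak += 1
    return lengths

def design_prefix_code(size, lam_step=0.001, lam_max=3.0):
    energies = [x * x + y * y for x, y in z2_constellation(size)]
    # Step 1: grid search for the gain-maximizing lambda
    grid = [k * lam_step for k in range(1, round(lam_max / lam_step) + 1)]
    lam_opt = max(grid, key=lambda lam: mb_gain(energies, lam))
    # Step 2: Maxwell-Boltzmann probabilities at lambda_opt
    probs = mb_probabilities(energies, lam_opt)
    # Step 3: Huffman procedure, then evaluate the dyadic scheme q_i = 2^(-l_i)
    lengths = huffman_lengths(probs)
    beta = sum(l * 2.0 ** -l for l in lengths)
    energy = sum(e * 2.0 ** -l for e, l in zip(energies, lengths))
    beta_p = min(lengths)  # 2 * l_1 / N with N = 2
    return {"lam_opt": lam_opt, "lengths": lengths, "beta_p": beta_p,
            "beta_s": beta - beta_p,
            "gain_db": 10.0 * math.log10(total_gain(beta, energy))}

if __name__ == "__main__":
    print(design_prefix_code(9))
```

For the 9-point constellation this sketch yields $\beta_p = 2$, $\beta_s = 1$, and a gain of about 0.669 dB, in agreement with the first row of Table I.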

Note that, in general, the Huffman procedure does not result in a
unique code. We have chosen the version of the Huffman procedure,
described in [36, p. 68], that results in least variation among the
codeword lengths. Note also that we have chosen $\lambda_{\rm opt}$
to maximize total gain. Since the coding gain is fixed for a given
lattice, this is equivalent to maximizing the product
$\gamma_s\gamma_d$. It is important to note that this is not
equivalent to maximizing the shaping gain $\gamma_s$, since this, in
principle, can be accomplished by making $\lambda$ arbitrarily large.

We have applied this procedure to spherical constellations based on
the 2-D lattice $Z^2$ and its translate $(Z + 1/2)^2$, as well as the
2-D hexagonal lattice $A_2$. The results are given in Tables I-III.
Each constellation $\Omega$ consists of the $|\Omega|$ points of
least energy drawn from the corresponding infinite lattice (or
translate). Shown in each table are the primary and secondary bit
rates ($\beta_p$ and $\beta_s$) obtained from the Huffman code. Each
table lists the parameter $\gamma_s\gamma_d$, obtained by dividing
the total gain of the signaling scheme by the coding gain of the
lattice upon which it is based. The 2-D peak-to-average energy ratio
PAR2 and the 2-D constellation expansion ratio CER2 are listed. In
addition, the parameter $N_{\rm eff}$, the "effective dimension," is
listed. We define the effective dimension of a shaping scheme to be
the smallest dimension $N$ for which the shaping gain of an
$N$-sphere $\gamma_\otimes(N)$ (39) (properly multiplied by
$\gamma_d$) meets or exceeds the shaping gain provided by the scheme
in question.

As can be seen from the tables, very satisfactory shaping gain
values, with effective dimensions numbering in the hundreds, are
obtained from these Huffman prefix codes.
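One possible reading of the effective-dimension definition is sketched below in Python: $N_{\rm eff}$ is taken as the smallest $N$ with $\gamma_\otimes(N)\,\gamma_d \geq \gamma_s\gamma_d$, where $\gamma_d = 1 - 2^{-\beta}$ is evaluated at the bit rate of the scheme. Both this interpretation and the closed form used for $\gamma_\otimes(N)$ (the standard expression for the $N$-sphere, assumed to be what (39) states) are assumptions; the function names are illustrative.

```python
import math

def sphere_shaping_gain(n):
    """Assumed closed form for the shaping gain of an N-sphere (uniform signaling)."""
    gamma_term = math.exp((2.0 / n) * math.lgamma(n / 2.0 + 1.0))
    return math.pi * (n + 2) / (12.0 * gamma_term)

def effective_dimension(gain_sd, beta, max_dim=100000):
    """Smallest N with sphere_shaping_gain(N) * (1 - 2^-beta) >= gain_sd.

    `gain_sd` is the product gamma_s * gamma_d of the scheme as a ratio (not dB).
    Returns None when no dimension up to max_dim reaches the target.
    """
    gamma_d = 1.0 - 2.0 ** (-beta)
    for n in range(1, max_dim + 1):
        if sphere_shaping_gain(n) * gamma_d >= gain_sd:
            return n
    return None

if __name__ == "__main__":
    # 9-point scheme of Fig. 5: total normalized gain 7/6 at beta = 3
    print(effective_dimension(7.0 / 6.0, 3.0))
```

Under these assumptions the 9-point scheme of Fig. 5 has effective dimension 47, the value listed in the first row of Table I.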

The shaping gains obtained from these Huffman codes seem to be the
highest ever reported, exceeding those reported in [8, Table IV]. As
previously noted, however, maximization of shaping gain $\gamma_s$
itself is not our aim; rather, we have attempted to maximize the
combination $\gamma_s\gamma_d$. Further numerical calculations
obtained from dyadic approximations to the Maxwell-Boltzmann
distribution with parameter $\lambda > \lambda_{\rm opt}$ show that,
in some cases, the effective dimension $N_{\rm eff}$ can be made to
increase significantly above the values shown in Tables I-III. Note
also that the PAR2 and CER2 values shown in Tables I-III are all
quite reasonable, especially when compared to the PAR2 and CER2 of
large-dimensional Voronoi constellations [6]. The PAR2 and CER2 can,
in principle, be improved by sacrificing some shaping gain. Indeed,
numerical calculations show that dyadic approximations to the
Maxwell-Boltzmann distribution with parameter
$\lambda < \lambda_{\rm opt}$ will, in general, result in improved
PAR2 and CER2, with some corresponding sacrifice in overall gain. We
have also applied this procedure to multidimensional constellations,
with similar results. However, since many multidimensional
constellations are best implemented as coset codes (see [10] and
[11]), it may be preferable to use a nonuniform signal point
selection scheme that is suited for a coset code (as described in the
next section) rather than a direct mapping of the words of a prefix
code onto the constellation points.

TABLE I
Performance of Huffman-Coded Signal Constellations Based on $Z^2$

|Ω|    β_p   β_s     γ_s γ_d (dB)   N_eff   PAR2    CER2
9      2     1.000   0.669          47      2.000   1.125
13     2     1.125   0.834          95      3.765   1.490
21     3     0.688   1.030          111     3.200   1.630
25     3     0.906   1.039          80      4.357   1.667
29     3     1.078   1.069          78      4.347   1.717
37     3     1.359   1.172          120     4.025   1.803
45     3     1.641   1.39           159     4.333   1.804
49     3     1.656   1.249          173     5.285   1.943
57     3     1.719   1.263          189     5.386   2.165
61     4     0.922   1.259          139     4.923   2.012
69     4     1.020   1.275          149     5.120   2.127
81     4     1.281   1.289          135     5.327   2.083
89     4     1.291   1.301          152     5.517   2.273
97     4     1.389   1.288          124     5.724   2.315
101    4     1.667   1.285          103     5.182   1.988
109    4     1.708   1.301          115     5.368   2.085
113    4     1.710   1.303          117     5.679   2.159
121    4     1.780   1.318          129     5.573   2.202
129    4     1.784   1.323          136     6.015   2.341
137    5     1.007   1.340          145     5.291   2.131
145    5     1.077   1.359          171     5.550   2.148
149    5     1.078   1.360          174     6.042   2.205
161    5     1.119   1.367          184     5.998   2.316
169    5     1.168   1.369          184     6.031   2.350
177    5     1.256   1.373          186     5.784   2.316
185    5     1.304   1.377          190     6.124   2.341
193    5     1.319   1.379          194     6.376   2.417
197    5     1.336   1.379          192     6.611   2.438

TABLE II
Performance of Huffman-Coded Signal Constellations
Based on (Z + 1/2)^2

|Ω|   β_p   β_s          γ_sγ_d (dB)  N_eff        PAR2         CER2_s
12    2     1.000        0.670        47           2.500        1.500
16    2     1.375        0.757        37           3.429        1.542
24    3     0.844        1.045        93           3.714        1.672
32    3     1.094        1.136        130          4.121        1.874
44    4     0.734        1.123        58           3.791        1.653
52    4     0.820        1.190        85           4.199        1.841
60    4     0.969        1.232        104          4.862        1.916
68    4     1.086        1.253        113          4.979        2.002
76    4     1.266        1.289        138          4.848        1.976
80    4     1.395        1.320        172          4.851        1.902
88    4     1.412        1.334        202          5.198        2.067
96    4     1.432        1.341        219          5.911        2.224
112   4     1.691        1.340        175          5.239        2.167
120   4     1.828        1.353        185          5.358        2.112
124   5     0.945        1.376        232          5.502        2.012
140   5     0.955        1.385        268          5.747        2.257
148   5     1.083        1.397        293          5.514        2.183
156   5     1.157        1.406        328          5.716        2.186
164   5     1.166        1.410        351          5.921        2.284
172   5     [illegible]  1.414        [illegible]  [illegible]  2.303
180   5     1.224        1.417        384          6.369        2.408
188   5     1.232        1.418        390          6.558        2.500
192   5     1.245        1.416        373          6.719        2.531
208   5     1.378        1.398        247          6.297        2.501
216   5     1.449        1.389        209          6.554        2.473
232   5     1.511        1.385        193          6.633        2.543
240   5     1.654        1.389        192          6.172        2.383
248   5     1.726        1.398        209          6.041        2.343
256   5     1.733        1.400        214          6.171        2.407

It follows from standard arguments in information theory
(e.g., [1, Section 5.4]) that the redundancy of the optimal code
for a discrete memoryless source can be made to approach zero
by considering Cartesian products of the source. This implies
that the relative entropy between the Maxwell-Boltzmann
distribution and its dyadic approximation can be made to
approach zero by considering Cartesian products of the basic
constellation. Convergence in relative entropy implies $L_1$
convergence of the probabilities [1, Section 12.6]; hence, the
performance obtained from our dyadic approximations can be
made to approach arbitrarily closely the performance obtained
by using the optimal Maxwell-Boltzmann distribution.
Numerical calculations confirm the performance improvement
obtained by working with Cartesian products of the basic
constellations.
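
The effect of working with Cartesian products can be illustrated numerically. The following Python sketch is not taken from the paper. It builds a Maxwell-Boltzmann distribution on a hypothetical 8-point one-dimensional constellation drawn from Z + 1/2 (the value lam = 0.3 is an arbitrary assumption), applies a textbook Huffman construction to the n-fold product, and reports the per-symbol redundancy, i.e., the relative entropy between the product distribution and its dyadic approximation, divided by n. Standard theory guarantees only that this quantity lies in [0, 1/n); all function names are illustrative.

```python
import heapq
import itertools
import math

def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for `probs`."""
    # Heap items: (probability, unique tie-breaker, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for sym in s1 + s2:          # each merge adds one bit to every leaf below it
            lengths[sym] += 1
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths

def maxwell_boltzmann(energies, lam):
    """Probabilities proportional to exp(-lam * energy)."""
    weights = [math.exp(-lam * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def redundancy_per_symbol(probs, n):
    """(Huffman average length - entropy) / n for the n-fold product source."""
    joint = [math.prod(t) for t in itertools.product(probs, repeat=n)]
    lengths = huffman_lengths(joint)
    avg_len = sum(p * l for p, l in zip(joint, lengths))
    entropy = -sum(p * math.log2(p) for p in joint)
    return (avg_len - entropy) / n

# Hypothetical example: 8 points of Z + 1/2, arbitrary Maxwell-Boltzmann parameter.
points = [k + 0.5 for k in range(-4, 4)]
probs = maxwell_boltzmann([x * x for x in points], lam=0.3)
for n in (1, 2, 3):
    print(n, round(redundancy_per_symbol(probs, n), 4))
```

The product alphabet grows as 8^n, which is why the sketch stops at n = 3; the same growth is the practical cost of this technique.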

VIII. Coded Nonuniform Signaling 

In this section we study how optimal nonuniform signaling 
fits into the general framework of coset codes, first introduced 
by Calderbank and Sloane [12] and extensively studied by 
Forney [10], [11]. 

A. Memoryless Signal Point Selectors 

A signaling scheme based on a coset code has two components
as shown in Fig. 1. A coset code C, based on the partitioning
of a lattice $\Lambda$ (possibly translated by some constant
vector) into the cosets of a sublattice $\Lambda'$, produces a
sequence of sets of channel symbols, drawn from the alphabet of
the cosets of $\Lambda'$ in $\Lambda$. The actual transmitted
constellation point is determined by the signal point selector S.
As discussed in Section II, both the coset code C and the signal
point selector S contribute to the transmission of data. It is
important to note that, as nonuniform signaling is a shaping
technique, for the schemes we propose only the signal point
selector S is affected. The coset code C is unchanged relative to
well-known schemes such as those of Ungerboeck [37].

TABLE III
Performance of Huffman-Coded Signal Constellations
Based on A_2

|Ω|   β_p   β_s          γ_sγ_d (dB)  N_eff        PAR2         CER2_s
7     1     0.750        0.423        27           [illegible]  1.041
13    2     1.031        0.732        62           [illegible]  1.590
19    2     1.453        0.884        62           [illegible]  [illegible]
31    3     1.187        1.047        59           [illegible]  [illegible]
37    3     1.477        1.142        81           3.815        [illegible]
43    3     1.668        1.242        157          4.531        [illegible]
55    3     1.754        1.292        268          [illegible]  2.038
61    4     1.006        1.304        214          [illegible]  1.899
73    4     1.049        1.324        273          [illegible]  [illegible]
85    4     1.449        1.286        117          [illegible]  [illegible]
91    4     1.473        1.296        126          5.376        [illegible]
97    4     1.557        1.314        143          [illegible]  [illegible]
109   4     1.695        1.330        155          [illegible]  [illegible]
121   4     1.766        1.337        161          5.472        2.224
127   5     0.851        1.344        163          5.992        2.201
139   5     0.912        1.351        171          5.907        2.308
151   5     1.078        1.368        191          5.564        2.236
163   5     1.159        1.379        211          5.809        2.282
169   5     1.167        1.380        214          6.448        2.352
187   5     1.280        1.379        199          6.082        2.407
199   5     1.441        1.386        200          5.772        2.291
211   5     1.487        1.393        215          6.134        2.352
223   5     1.526        1.396        222          6.395        2.420

The simplest type of signal point selector is memoryless,
or time invariant. When S is memoryless, each time a coset
of $\Lambda'$ is made available to S, the subset from which the
constellation point is selected is the same, and the choice is
made independently. For example, the signal point selector
could always choose from the K points of least norm in each
coset. For a block coset code C based on a 2-D lattice $\Lambda$,
this would result in a polydisc-shaped constellation. Cubic
constellations are achieved if S always selects from a
square-shaped region in each coset.

More complicated signal point selectors have memory, i.e.,
they are time-varying. To achieve generalized cross
constellations [2], Voronoi constellations [6], or indeed
constellations based on any region that is not a Cartesian
product of lower-dimensional regions, the signal point selector
must be time-varying. The block shaping codes of Calderbank and
Ozarow [7] and the trellis shaping codes of Forney [8] are
examples of time-varying signal point selectors. For coset codes
based on 2-D lattices, and assuming uniform signaling,
time-varying signal point selectors are necessary to achieve
shaping gains that exceed the shaping gain of a 2-D disc. Indeed,
the best possible shaping gain in N-space is achieved by an
N-sphere, a region not decomposable as a Cartesian product of
lower-dimensional regions.

As discussed in Section VI, under nonuniform signaling, the 
best regions with which to shape a constellation are polydiscs, 
as these achieve a given shaping gain with least shaping 
constellation expansion ratio $\mathrm{CER2}_s$ and least peak-to-average
energy ratio PAR2. Since polydiscs are by definition a product 
of 2-D discs, polydisc-shaped constellations can be achieved 
by a memoryless signal point selector combined with a coset 
code based on a 2-D lattice. We focus our attention, therefore, 
on memoryless nonuniform signal point selectors. 

B. Coset Codes

The coset codes considered in this paper are based on any
L-way partition $\Lambda/\Lambda'$ of an N-D lattice $\Lambda$
into the L cosets of a sublattice $\Lambda'$. We focus our
attention on binary coset codes, where $L = 2^{k+r}$, although
generalization to nonbinary coset codes is straightforward.

Let C be a binary rate-$k/(k+r)$ encoder that takes in k bits
per N dimensions and puts out $k + r$ coded bits. These coded
bits can be used to select one of the $2^{k+r}$ cosets of
$\Lambda'$ in $\Lambda$. The resulting coset code is denoted
$\mathbb{C}(\Lambda/\Lambda'; C)$. When the binary encoder C is a
block code, the coset code $\mathbb{C}(\Lambda/\Lambda'; C)$
defines a finite-dimensional sphere packing; often this sphere
packing is actually a lattice. When C is a convolutional code,
the resulting coset code is a trellis code.

Let us denote a set of $L = 2^{k+r} = |\Lambda/\Lambda'|$ coset
leaders of the cosets of $\Lambda'$ in $\Lambda$ by
$\{c_1, c_2, \ldots, c_L\}$. Each time the memoryless signal
point selector S is presented with the ith coset
$c_i + \Lambda'$, it selects some point for transmission. The set
of all possible points drawn from the ith coset forms the ith
constellation $\Omega_i$.

Given that S is presented with the ith coset, if a point
$r_i \in c_i + \Lambda'$ is selected with probability $p(r_i)$,
then we can determine a normalized average bit rate
$\beta_i = -2\sum_{r_i \in \Omega_i} p(r_i)\log_2[p(r_i)]/N$ and
a normalized average energy
$E_i = 2\sum_{r_i \in \Omega_i} p(r_i)\|r_i\|^2/N$ for the ith
constellation. If the coset code C selects the ith coset with
probability $P[i]$ (and usually $P[i] = 1/L$), then the
normalized average number of bits taken in by the signal point
selector S is $\kappa(S) = \sum_{i=1}^{L} P[i]\beta_i$, and the
normalized average energy is $E = \sum_{i=1}^{L} P[i]E_i$. Since
the code C takes in k bits per N dimensions, or
$\kappa(C) = 2k/N$ bits per two dimensions, the overall
normalized bit rate is $\beta = \kappa(C) + \kappa(S)$. If the
coset code C has minimum squared Euclidean distance
$d_{\min}^2$, then, from (8) the gain G of the coded modulation
scheme may be written as

$$G = [\text{first expression illegible in source}] = \gamma_c(C)\,\gamma_s(S)\,\gamma_d(\beta).$$

Here, as in (9), we have separated the total gain into the
product of the coding gain $\gamma_c(C)$ of the coset code C [10],
the overall shaping gain $\gamma_s(S)$ provided by the signal
point selector S, and a discretization factor $\gamma_d(\beta)$.

C. Continuous Approximations 

Suppose now that each constellation $\Omega_i$ is obtained from
the intersection of the ith coset of $\Lambda'$ with the same
finite region $\mathcal{R}$. Further, suppose that the memoryless
signal point selector S selects each point in $\Omega_i$ with a
Maxwell-Boltzmann distribution, i.e., given that S is presented
with the ith constellation, a point $r_i \in \Omega_i$ is
selected with probability
$p(r_i) = \exp(-\lambda\|r_i\|^2)/Z_i(\lambda)$, where $\lambda$
is fixed for all constellations.

Using the same continuous approximation principles as in
Section VI, the average energy $E_i$ and bit rate $\beta_i$ for
each subconstellation can be estimated via (21) and (23) from a
continuous Gaussian distribution, truncated to the region
$\mathcal{R}$. It follows that each subconstellation supports
approximately the same bit rate, at the same cost in average
energy. Thus, independently of the coset code C and the
probability with which each coset is selected,
$\kappa(S) \approx \beta(\mathcal{R}, \lambda)$ and
$E \approx E(\mathcal{R}, \lambda)$.
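
As a rough numerical check of this continuous approximation, the Python sketch below compares the exact rate and energy of each coset with truncated-Gaussian estimates for a disc. It is an illustration under stated assumptions, not the authors' computation: it assumes the partition Z^2/2Z^2 translated by (1/2, 1/2), a disc of radius 8, and lambda = 0.1 (all arbitrary choices), and it assumes that the estimates (21) and (23), which are not reproduced in this section, take the usual form in which the energy is the truncated-Gaussian mean and the rate is the differential entropy minus log2 of the fundamental volume of the sublattice (here 4). The closed forms for the disc are the N = 2 case of the incomplete Gamma expressions of Appendix B.

```python
import math

def coset_points(offset, spacing, radius):
    """Points of (spacing * Z^2 + offset) that lie inside the disc."""
    ox, oy = offset
    reach = int(radius // spacing) + 2
    pts = []
    for a in range(-reach, reach + 1):
        for b in range(-reach, reach + 1):
            x, y = spacing * a + ox, spacing * b + oy
            if x * x + y * y <= radius * radius:
                pts.append((x, y))
    return pts

def discrete_rate_energy(points, lam):
    """Entropy (bits) and mean squared norm under Maxwell-Boltzmann selection.

    For N = 2 these are already the per-two-dimension quantities.
    """
    norms = [x * x + y * y for x, y in points]
    weights = [math.exp(-lam * n) for n in norms]
    z = sum(weights)
    probs = [w / z for w in weights]
    rate = -sum(p * math.log2(p) for p in probs)
    energy = sum(p * n for p, n in zip(probs, norms))
    return rate, energy

def continuous_rate_energy(radius, lam, cell_volume):
    """Truncated-Gaussian estimates for a disc (assumed form of (21), (23))."""
    x = lam * radius ** 2
    p0 = -math.expm1(-x)                    # P(1, x) = 1 - exp(-x)
    p1 = 1.0 - (1.0 + x) * math.exp(-x)     # P(2, x) = 1 - (1 + x) exp(-x)
    energy = p1 / (lam * p0)
    entropy = math.log2(math.pi * p0 / lam) + lam * energy / math.log(2)
    return entropy - math.log2(cell_volume), energy

RADIUS, LAM = 8.0, 0.1                      # illustrative values only
OFFSETS = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
est_rate, est_energy = continuous_rate_energy(RADIUS, LAM, cell_volume=4.0)
for off in OFFSETS:
    rate, energy = discrete_rate_energy(coset_points(off, 2.0, RADIUS), LAM)
    print(off, round(rate, 3), round(energy, 3),
          "estimates:", round(est_rate, 3), round(est_energy, 3))
```

With this particular translate the four cosets are mirror images of one another, so their exact figures coincide; the comparison of interest is between those figures and the continuous estimates.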

Using these estimates for bit rate and average energy, the
shaping gain $\gamma_s(S)$ is estimated by

$$\gamma_s(\mathcal{R}, \lambda) = \frac{2^{2H(\mathcal{R}, \lambda)/N}}{6E(\mathcal{R}, \lambda)},$$

which is the same expression as (25). Similarly, we find that
CER2, the 2-D constellation expansion ratio, is approximately
the product of a coding constellation expansion ratio
$\mathrm{CER2}_c(C)$ and a shaping constellation expansion ratio
$\mathrm{CER2}_s(S)$. In terms of the normalized redundancy
$\rho(C) = 2r/N$ of the binary encoder C, we have

$$\mathrm{CER2}_c(C) = 2^{\rho(C)}\,\frac{V(\Lambda)^{2/N}}{V(\Lambda_2)},$$

while

$$\mathrm{CER2}_s(S) = 2^{\rho(\mathcal{R}, \lambda)}\,\frac{V(\mathcal{R}_2)}{V(\mathcal{R})^{2/N}},$$

exactly as in Section VI. Approximations for the 2-D
peak-to-average energy ratio PAR2 also lead to expressions
identical to those given in Section VI.

In general, to the accuracy of the continuous approximation,
the shaping gain $\gamma_s(\mathcal{R}, \lambda)$ achieved by our
memoryless nonuniform signal point selector S is completely
independent of the choice of coset code C. Trade-offs involving
shaping constellation expansion ratio $\mathrm{CER2}_s(S)$ and
PAR2 are also completely independent of the choice of coset
code C. As noted, it is desirable to have constellations shaped
as polydiscs, since these achieve a given shaping gain with
minimum $\mathrm{CER2}_s$ and minimum PAR2. Therefore, a
nonuniform signaling scheme based on a multidimensional lattice
$\Lambda$ is perhaps best implemented as a coset code involving
the constituent 2-D lattice $\Lambda_2$, in which the ith
constellation $\Omega_i$ is circular and the signal point
selector S chooses from $\Omega_i$ according to a
Maxwell-Boltzmann distribution.





Fig. 9. The sequence of subconstellations {A, B, C, D} is determined by
an Ungerboeck code. Signal point selection within each subconstellation is
performed using a simple prefix code.



D. Memoryless Signal Point Selection with Huffman Codes 

As in the case of uncoded transmission, probably the
simplest method of selecting points from the coset sequences
generated by a coset code is through a complete binary
prefix code. Because we would like our constellations to be
polydiscs, and we have restricted our attention to binary coset
codes, we focus on coset codes based on the 2-D lattice
$\mathbb{Z}^2$.

A simple example of combining a nonuniform memoryless
signal point selector with a trellis code is shown in Fig. 9.
The trellis code is a simple four-state Ungerboeck code [37]
based on a translate of the four-way partition
$\mathbb{Z}^2/2\mathbb{Z}^2$; this code provides a coding gain
$\gamma_c = 2$ (3.01 dB). Each subconstellation (labeled A, B, C,
or D) consists of a single inner point at squared Euclidean
distance 1/2 from the origin and two outer points at squared
Euclidean distance 5/2 from the origin. By using the prefix code
{0, 10, 11}, the signal point selector chooses the inner point
with probability 1/2 and each outer point with probability 1/4.
The transmitted rate $\beta$ for this scheme is 2.5 bits with a
primary channel rate of 2 bits and a secondary channel rate of
0.5 bits. The shaping gain $\gamma_s = 0.994$ dB. The overall
gain $G = \gamma_c\gamma_s\gamma_d = 3.16$ dB.
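
The figures quoted for this example can be reproduced with a few lines of arithmetic. The Python sketch below is a hedged reconstruction rather than the paper's own calculation: the rate and energy follow directly from the prefix code and the squared distances given above, but the expressions used for the shaping gain, $V(\Lambda')^{2/N}2^{\kappa(S)}/(6E)$, and for the discretization factor, $1 - 2^{-\beta}$, are assumptions chosen because they reproduce the quoted 0.994 dB and 3.16 dB; equations (8) and (9) are not available in this section to confirm them.

```python
import math

def to_db(x):
    return 10.0 * math.log10(x)

# (codeword, squared norm) for the three points of one subconstellation in Fig. 9.
POINTS = [("0", 0.5), ("10", 2.5), ("11", 2.5)]

probs = [2.0 ** -len(word) for word, _ in POINTS]       # 1/2, 1/4, 1/4
kappa_s = -sum(p * math.log2(p) for p in probs)         # bits per 2-D symbol carried by S
energy = sum(p * norm for p, (_, norm) in zip(probs, POINTS))
kappa_c = 1.0                                           # rate-1/2 encoder, N = 2
beta = kappa_c + kappa_s                                # overall normalized rate

gamma_c = 2.0                                           # coding gain quoted in the text
vol_sub = 4.0                                           # V(2Z^2); exponent 2/N = 1
gamma_s = vol_sub * 2.0 ** kappa_s / (6.0 * energy)     # ASSUMED form of the shaping gain
gamma_d = 1.0 - 2.0 ** -beta                            # ASSUMED form of the discretization factor
gain = gamma_c * gamma_s * gamma_d

print("beta =", beta, "E =", energy)
print("gamma_s (dB) =", round(to_db(gamma_s), 3))
print("gamma_s * gamma_d (dB) =", round(to_db(gamma_s * gamma_d), 3))
print("G (dB) =", round(to_db(gain), 2))
```

Under these assumptions the product of shaping gain and discretization factor also agrees with the 0.149 dB entry in the 12-point row of Table IV.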

The four-way partition $\mathbb{Z}^2/2\mathbb{Z}^2$, translated
as in Fig. 9, yields four cosets with identical weight
distributions. The weight enumerator for these cosets begins

$$\Theta_{\Omega}(x) = x^{1/2} + 2x^{5/2} + x^{9/2} + 2x^{13/2} + 2x^{17/2} + \cdots.$$

As in Section VII, we have computed dyadic approximations
to the optimal Maxwell-Boltzmann distribution by applying
the Huffman procedure. The results are given in Table IV. We
have assumed a coset code
$\mathbb{C}(\mathbb{Z}^2/2\mathbb{Z}^2; C)$ with a normalized
bit rate $\kappa(C) = 1$. This bit rate is included in the
primary rate $\beta_p$ of Table IV. The overall constellation, of
size $|\Omega|$, is the union of four subconstellations, each of
size $|\Omega|/4$. As in Tables I-III, the gain
$\gamma_s\gamma_d$ and the effective dimension $N_{\mathrm{eff}}$
are listed, along with PAR2 and $\mathrm{CER2}_s$. Note that
$\mathrm{CER2}_c(C) = 2$. As before, very satisfactory shaping
gains are achievable via these prefix codes, with
$N_{\mathrm{eff}}$ numbering in the hundreds of dimensions for
the larger constellations.

TABLE IV
Performance of Huffman-Coded Rate-1/2 Coset Codes
Based on Z^2/2Z^2

|Ω|   β_p        β_s      γ_sγ_d (dB)  N_eff   PAR2    CER2_s
12    [missing]  0.500    0.149        17      1.667   1.061
16    [missing]  0.750    0.378        23      2.571   1.189
24    3          0.375    0.555        18      2.364   1.157
32    3          0.562    0.798        34      2.833   1.354
44    3          0.906    0.948        46      3.333   1.467
52    3          1.281    1.064        60      3.013   1.337
60    3          1.328    1.128        87      3.771   1.494
68    3          1.375    1.168        112     4.075   1.639
76    3          1.562    1.155        82      3.892   1.608
80    3          1.570    1.174        94      4.232   1.684
88    4          0.883    1.221        102     3.695   1.491
96    4          0.906    1.247        126     4.207   1.601
112   4          0.938    1.289        192     4.426   1.827
120   4          1.180    1.295        158     4.186   1.655
124   4          1.184    1.303        173     4.641   1.706
140   4          1.207    1.329        239     4.818   1.895
148   4          1.213    1.340        280     5.036   1.995
156   4          1.432    1.292        125     4.647   1.807
164   4          1.455    1.303        137     4.770   1.869
172   4          1.494    1.306        137     5.011   1.908
180   4          1.656    1.308        127     4.634   1.785
188   4          1.705    1.321        140     4.649   1.802
192   4          1.707    1.326        147     4.807   1.838
208   4          1.717    1.342        174     4.950   1.977
216   4          1.728    1.345        179     5.387   2.038
232   5          0.924    1.342        153     4.961   1.911
240   5          0.941    1.348        162     5.042   1.953
248   5          0.986    1.353        169     5.023   1.956
256   5          1.020    1.357        174     5.040   1.973

Although the prefix codes for the smaller constellations
have smaller $N_{\mathrm{eff}}$, as discussed at the end of
Section VII, applying the Huffman procedure to Cartesian
products of these subconstellations can improve the gain. Note,
however, that the signal point selector S is no longer
memoryless in this case.
Other partitions of $\mathbb{Z}^2$, e.g., the 8-way partition
$\mathbb{Z}^2/2R\mathbb{Z}^2$ or the 16-way partition
$\mathbb{Z}^2/4\mathbb{Z}^2$, allow the use of more powerful
coset codes, many of which are listed in [10, Tables IV, V,
IX, X, XI]. Unlike the 4-way partition
$\mathbb{Z}^2/2\mathbb{Z}^2$, the cosets in these partitions do
not all have the same theta series. For example, the 8-way
partition of $\mathbb{Z}^2 + (1/2, 1/2)$ has two classes of
cosets [typified by $A = 2R\mathbb{Z}^2 + (1/2, 1/2)$ and
$B = 2R\mathbb{Z}^2 + (3/2, 1/2)$] with weight enumerators

$$\Theta_A(x) = x^{1/2} + x^{9/2} + 2x^{17/2} + 3x^{25/2} + \cdots$$

$$\Theta_B(x) = 2x^{5/2} + 2x^{13/2} + 2x^{29/2} + 2x^{37/2} + \cdots,$$

respectively. Each class consists of four different cosets.
Similarly, the 16-way partition of $\mathbb{Z}^2 + (1/2, 1/2)$
has three classes of cosets [typified by
$A = 4\mathbb{Z}^2 + (1/2, 1/2)$, $B = 4\mathbb{Z}^2 + (3/2, 1/2)$,
and $C = 4\mathbb{Z}^2 + (3/2, 3/2)$] with weight enumerators

$$\Theta_A(x) = x^{1/2} + 2x^{25/2} + 2x^{41/2} + x^{49/2} + \cdots$$

$$\Theta_B(x) = x^{5/2} + x^{13/2} + x^{29/2} + x^{37/2} + \cdots$$

$$\Theta_C(x) = x^{9/2} + 2x^{17/2} + x^{25/2} + 2x^{65/2} + \cdots,$$

respectively. Class A consists of four cosets, class B consists
of eight cosets, and class C consists of four cosets. We have
found good dyadic approximations to the Maxwell-Boltzmann
distribution for these partitions, with results similar to those
given in Table IV. Many different schemes may be designed
by combining the various signal point selectors obtained. For
example, it may be advantageous for different cosets to convey
different numbers of secondary channel bits.
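
The leading terms of the coset weight enumerators quoted in this section can be checked by brute-force enumeration. The Python sketch below is an illustration only; the function names are hypothetical, and the rotation operator R is assumed to map Z^2 onto the checkerboard sublattice of points with even coordinate sum, so that 2RZ^2 consists of the points (2a, 2b) with a + b even. Exact rational arithmetic is used so that squared norms such as 17/2 compare exactly.

```python
from collections import Counter
from fractions import Fraction

HALF = Fraction(1, 2)

def coset_enumerator(in_sublattice, offset, max_norm, span=10):
    """Map squared norm -> number of points of (sublattice + offset), up to max_norm."""
    counts = Counter()
    for a in range(-span, span + 1):
        for b in range(-span, span + 1):
            if in_sublattice(a, b):
                norm = (a + offset[0]) ** 2 + (b + offset[1]) ** 2
                if norm <= max_norm:
                    counts[norm] += 1
    return dict(sorted(counts.items()))

def in_2Z2(a, b):
    return a % 2 == 0 and b % 2 == 0

def in_2RZ2(a, b):
    # Assumed convention: 2RZ^2 = {(2i, 2j) : i + j even}.
    return a % 2 == 0 and b % 2 == 0 and (a // 2 + b // 2) % 2 == 0

def in_4Z2(a, b):
    return a % 4 == 0 and b % 4 == 0

def show(label, enum):
    terms = " + ".join(f"{c}x^({n})" for n, c in enum.items())
    print(label, terms)

show("4-way:      ", coset_enumerator(in_2Z2, (HALF, HALF), Fraction(17, 2)))
show("8-way, A:   ", coset_enumerator(in_2RZ2, (HALF, HALF), Fraction(25, 2)))
show("8-way, B:   ", coset_enumerator(in_2RZ2, (3 * HALF, HALF), Fraction(37, 2)))
show("16-way, A:  ", coset_enumerator(in_4Z2, (HALF, HALF), Fraction(49, 2)))
show("16-way, B:  ", coset_enumerator(in_4Z2, (3 * HALF, HALF), Fraction(37, 2)))
show("16-way, C:  ", coset_enumerator(in_4Z2, (3 * HALF, 3 * HALF), Fraction(65, 2)))
```

The default search window (span = 10) is far larger than needed for the norms requested here; it would have to grow if higher-order terms were wanted.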

In general, by applying the Huffman algorithm to obtain
dyadic approximations to the Maxwell-Boltzmann distribution,
we have been able to obtain a variety of different schemes.
By varying $\lambda$, schemes that trade off gain for improved
PAR2 and $\mathrm{CER2}_s$ are easily obtained.
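
The nature of this trade-off can be seen from the continuous approximations of Appendix B, specialized to a 2-D disc (N = 2), where the incomplete Gamma functions have the closed forms P(1, x) = 1 - exp(-x) and P(2, x) = 1 - (1 + x) exp(-x). The Python sketch below is an illustration under that assumption; it tabulates the estimated shaping gain, PAR2, and shaping CER2 as functions of x = lambda R^2, and it describes the continuous limit only, not any particular Huffman-coded constellation from the tables.

```python
import math

def disc_tradeoff(x):
    """Continuous-approximation figures for a 2-D disc, with x = lambda * R^2."""
    p0 = -math.expm1(-x)                   # P(1, x)
    p1 = 1.0 - (1.0 + x) * math.exp(-x)    # P(2, x)
    ratio = p1 / p0
    gamma_s = (math.pi / 6.0) * math.exp(ratio) * p0 ** 2 / p1   # (36) with N = 2
    par2 = x * p0 / p1                                           # (37) with N = 2
    cer2_s = (x / p0) * math.exp(-ratio)                         # (38) with N = 2
    return gamma_s, par2, cer2_s

for x in (0.5, 1.0, 2.0, 4.0, 8.0):
    gamma_s, par2, cer2_s = disc_tradeoff(x)
    print(f"x = {x:4.1f}  gain = {10 * math.log10(gamma_s):.3f} dB  "
          f"PAR2 = {par2:.3f}  CER2_s = {cer2_s:.3f}")
```

As x tends to zero the figures approach those of a uniformly used disc (gain pi/3, PAR2 = 2, CER2_s = 1), while for large x the gain approaches pi e/6 at the price of PAR2 and CER2_s that grow in proportion to x.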

IX. Discussion and Conclusion

From the point of view of coded modulation, we have seen 
that nonuniform signal point selection is an energy-minimizing 
or shaping operation. When constellation points are selected 
with Maxwell-Boltzmann probabilities, the ultimate in shaping 
gain performance can be achieved in any dimension. Dyadic 
approximations to the optimal Maxwell-Boltzmann distribu- 
tion are easily obtained by applying the Huffman procedure. 
The performance of the resulting shaping schemes is often 
close to optimum, with effective dimensions numbering in 
the hundreds, and can be made to approach the optimum 
by considering Cartesian products of the basic constellations. 
By varying the Maxwell-Boltzmann parameter, trade-offs
between shaping gain, 2-D constellation expansion ratio, and 2-D
peak-to-average energy ratio are easily accomplished. In a
sense, the implementation complexity for these schemes is
trivial, since data to constellation point mappings (and vice
versa) are easily performed by table lookup. Furthermore,
these schemes are easily incorporated into well-known lattice- 
type coded modulation schemes. All of these properties make 
nonuniform signaling very attractive. 

The principal drawback, as pointed out at the outset, is 
the variable bit rate. While only the secondary channel data 
are subject to the problems associated with buffer under- 
and overflow and the insertion and deletion of bits in the 
decoded bit stream, these problems may be acceptable only 
in certain applications, e.g., for the transmission of internal 
control signals. Left unsolved, these problems will tend to 
limit the broad applicability of nonuniform signaling. Solving 
the system problems associated with nonuniform signaling 
will certainly increase the complexity of implementation. 
Yet, in order to achieve the large shaping gains achieved 
with nonuniform signaling, uniform signaling schemes will 
themselves tend to become quite complex (see [8, Table IV]). 
It remains an open problem to evaluate and compare these 
complexities. 

Appendix A 

Error Coefficients for Baseline Constellations 
In this appendix, we compute the average nearest neighbor
multiplicity (error coefficient) for a cubic constellation of side
M drawn from the lattice $\mathbb{Z}^N$. This problem is easily
solved using the notion of a nearest neighbor enumerator.

Given a finite constellation $\Omega$, recall that
$N_{\min}(r)$ denotes the number of points of $\Omega$ at
distance $d_{\min}$ from the point r. Define
$A(x) = \sum_{r \in \Omega} x^{N_{\min}(r)}$. $A(x)$ is a
polynomial with integer coefficients that we call the nearest
neighbor enumerator for $\Omega$. For example, a simple 1-D PAM
constellation with M points has $A(x) = 2x + (M - 2)x^2$,
indicating that two points of the constellation have a single
nearest neighbor, while $M - 2$ points have two nearest
neighbors.

Assuming uniform signal point selection, it is easy to see
that the average nearest neighbor multiplicity $\bar{N}$ is given
by $A'(1)/A(1)$, where $A'(x) = dA(x)/dx$. Thus for the M-PAM
example of the previous paragraph, $\bar{N} = 2(1 - 1/M)$.

It is easily seen that the nearest neighbor enumerator for
$\Omega^n$, the n-fold Cartesian product of $\Omega$ with itself,
is $A(x)^n$. The average nearest neighbor multiplicity of the
Cartesian product is then
$nA(1)^{n-1}A'(1)/A(1)^n = nA'(1)/A(1)$; in other words, taking
the n-fold Cartesian product of a constellation with itself
multiplies the average nearest neighbor multiplicity by n. Since
an N-D cubic constellation of side M is the N-fold Cartesian
product of simple 1-D PAM constellations, we have
$\bar{N} = 2N(1 - 1/M)$ for such constellations. Furthermore,
since $M = 2^{\beta/2}$, we obtain
$\bar{N} = 2N(1 - 2^{-\beta/2})$ for our N-D baseline
constellations.
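
The enumerator argument can be checked mechanically. The Python sketch below (illustrative helper names, not part of the paper) represents A(x) as a list of coefficients, raises it to the Nth power by repeated polynomial multiplication, evaluates A'(1)/A(1), and compares the result with a direct count of nearest neighbors in a small cubic constellation on the integer grid. It assumes M is at least 2.

```python
import itertools

def pam_enumerator(m):
    """Coefficients of A(x) = 2x + (M - 2)x^2; index = number of nearest neighbors."""
    return [0, 2, m - 2]

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_pow(a, n):
    result = [1]
    for _ in range(n):
        result = poly_mul(result, a)
    return result

def average_multiplicity(coeffs):
    """A'(1) / A(1) for a polynomial given by its coefficient list."""
    return sum(k * c for k, c in enumerate(coeffs)) / sum(coeffs)

def brute_force_multiplicity(m, n):
    """Average number of nearest neighbors in the M^N cubic constellation."""
    points = list(itertools.product(range(m), repeat=n))
    point_set = set(points)
    total = 0
    for p in points:
        for axis in range(n):
            for step in (-1, 1):
                q = p[:axis] + (p[axis] + step,) + p[axis + 1:]
                total += q in point_set
    return total / len(points)

M, N = 4, 3
print(average_multiplicity(poly_pow(pam_enumerator(M), N)),
      brute_force_multiplicity(M, N),
      2 * N * (1 - 1 / M))
```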

Appendix B 
Continuous Approximations for
Spherical Constellations

In this appendix, we specialize the continuous approximations
derived in Section VI to the case where
$\mathcal{R} = B_N(R)$, an N-ball of radius R centered at the
origin. Cubic constellations can be considered to be Cartesian
products of 1-D "spherical constellations" and so are (by the
separability of Gaussian densities) a special case. By letting
$R \to \infty$, we obtain continuous approximations for the case
of an infinite constellation.

Many of the expressions derived in this Appendix may
be written in terms of the (normalized) incomplete Gamma
function $P(a, x)$, defined in [23] as

$$P(a, x) = \frac{1}{\Gamma(a)} \int_0^x t^{a-1} e^{-t}\,dt,$$

where we note that $\lim_{x \to \infty} P(a, x) = 1$.

For a finite N-D spherical constellation $\Omega$, in our
estimates we choose the spherical radius R so that
$V(B_N(R)) = |\Omega|V(\Lambda)$, i.e., so that approximation (19)
holds with equality.

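Energy: From (21) we obtain

$$E(B_N(R), \lambda) = \frac{1}{\lambda} \cdot \frac{P(N/2 + 1, \lambda R^2)}{P(N/2, \lambda R^2)}.$$

Since $E(B_N(R), 0) = 2R^2/(N + 2)$, we find that the energy
savings factor (29) is

$$g_E(B_N(R), \lambda) = \frac{2\lambda R^2\,P(N/2, \lambda R^2)}{(N + 2)\,P(N/2 + 1, \lambda R^2)}. \tag{34}$$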



Bit Rate: The entropy of a continuous Gaussian random
variable (with parameter $\lambda$) truncated to $B_N(R)$ is
given by

$$H(B_N(R), \lambda) = \log_2\left[(\pi/\lambda)^{N/2} P(N/2, \lambda R^2)\right] + N\lambda E(B_N(R), \lambda)/(2 \ln 2).$$

Setting $\lambda = 0$ yields

$$H(B_N(R), 0) = \log_2[V(B_N(R))] = \log_2\left[(\pi R^2)^{N/2}/\Gamma(N/2 + 1)\right].$$

Combining these expressions gives an estimate for the
normalized redundancy (27), namely,

$$\rho(B_N(R), \lambda) = \log_2\left[\frac{\lambda R^2}{[\Gamma(N/2 + 1)P(N/2, \lambda R^2)]^{2/N}}\right] - \frac{P(N/2 + 1, \lambda R^2)}{P(N/2, \lambda R^2)} \log_2 e. \tag{35}$$

The bit rate can be estimated via (23) or via (27). For a
spherical constellation of size $|\Omega|$, we use the
approximation
$\beta(B_N(R), \lambda) = (2/N)\log_2|\Omega| - \rho(B_N(R), \lambda)$.

Shaping and Biasing Gains: Substituting (34) and (35) into
(28) gives an estimate for the biasing gain. Multiplying the
biasing gain by
$\gamma_\otimes(N) = \pi(N/2 + 1)/[6\Gamma(N/2 + 1)^{2/N}]$,
the shaping gain of an N-sphere under uniform signaling [2],
yields an estimate for the shaping gain under nonuniform
signaling. Explicitly, the shaping gain
$\gamma_s(B_N(R), \lambda)$ is

$$\gamma_s(B_N(R), \lambda) = \frac{\pi}{6} \exp\left(\frac{P(N/2 + 1, \lambda R^2)}{P(N/2, \lambda R^2)}\right) \frac{P(N/2, \lambda R^2)^{(2/N)+1}}{P(N/2 + 1, \lambda R^2)}. \tag{36}$$
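
Expression (36) is straightforward to evaluate numerically. The Python sketch below is an illustration, not the authors' code: it computes the normalized incomplete Gamma function with the standard power series (an implementation choice made here so that only the standard library is needed) and then evaluates the shaping gain estimate as a function of the dimension N and of x = lambda R^2. The two limits checked in the usage lines are the uniform-signaling gain of the N-sphere as x tends to zero and pi e/6 as x grows.

```python
import math

def reg_lower_gamma(a, x, tol=1e-15, max_terms=10000):
    """Normalized incomplete Gamma function P(a, x), via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    ap = a
    for _ in range(max_terms):
        ap += 1.0
        term *= x / ap
        total += term
        if term < tol * total:       # all terms are positive, so this is a safe stop
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def shaping_gain(n, x):
    """Estimate (36) for an N-ball, with x = lambda * R^2."""
    p0 = reg_lower_gamma(n / 2.0, x)
    p1 = reg_lower_gamma(n / 2.0 + 1.0, x)
    return (math.pi / 6.0) * math.exp(p1 / p0) * p0 ** (2.0 / n + 1.0) / p1

def sphere_gain_uniform(n):
    """Shaping gain of the N-sphere under uniform signaling."""
    return math.pi * (n / 2.0 + 1.0) / (6.0 * math.exp(2.0 * math.lgamma(n / 2.0 + 1.0) / n))

for n in (2, 16):
    print(n, shaping_gain(n, 1e-6), sphere_gain_uniform(n),
          shaping_gain(n, 60.0), math.pi * math.e / 6.0)
```

The series is adequate for the moderate arguments used here; for very large x a continued-fraction evaluation or a library routine would be the more usual choice.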



PAR2: The constituent 2-D constellation of $B_N(R)$ (as
defined in Section III) is a 2-D disc $B_2(R)$ (with peak energy
$R^2$) when N is even, and a square $B_1^2(R)$ of side 2R (with
peak energy $2R^2$) when N is odd; this may be expressed
compactly by writing
$r_{\max}^2(B_N(R)) = [3 - (-1)^N]R^2/2$. Since the normalized
average energy under uniform signaling is $2R^2/(N + 2)$, we have

$$\mathrm{PAR2}(B_N(R), 0) = (3 - (-1)^N)(N + 2)/4.$$

Under nonuniform signaling this PAR2 is increased by a factor
of $g_E$ so that

$$\mathrm{PAR2}(B_N(R), \lambda) = \frac{[3 - (-1)^N]\,\lambda R^2\,P(N/2, \lambda R^2)}{2P(N/2 + 1, \lambda R^2)}. \tag{37}$$



$\mathrm{CER2}_s$: Under uniform signaling, a large
even-dimensional spherical constellation induces a shaping
constellation expansion ratio of
$\mathrm{CER2}_s = \Gamma(N/2 + 1)^{2/N}$ (see [2]), while a
large odd-dimensional spherical constellation has
$\mathrm{CER2}_s = (4/\pi)\Gamma(N/2 + 1)^{2/N}$. Under
nonuniform signaling, this $\mathrm{CER2}_s$ is increased by a
factor of $2^{\rho(B_N(R), \lambda)}$, so that

$$\mathrm{CER2}_s(B_N(R), \lambda) = \frac{\pi + 4 + (-1)^N(\pi - 4)}{2\pi} \cdot \frac{\lambda R^2}{P(N/2, \lambda R^2)^{2/N}} \cdot \exp\left[-\frac{P(N/2 + 1, \lambda R^2)}{P(N/2, \lambda R^2)}\right]. \tag{38}$$

Limiting Case: When $R \to \infty$, the truncated Gaussian
random variable approaches a standard (untruncated) Gaussian
random variable. For large R, the normalized average energy
$E(B_N(R), \lambda) \to 1/\lambda$, independently of N.
Similarly, the energy savings factor
$g_E(B_N(R), \lambda) \to 2\lambda R^2/(N + 2)$ and the
normalized redundancy
$\rho(B_N(R), \lambda) \to \log_2(\lambda R^2/[e\Gamma(N/2 + 1)^{2/N}])$.
Thus the biasing gain

$$\gamma_b(B_N(R), \lambda) \to 2e\Gamma(N/2 + 1)^{2/N}/(N + 2) = \pi e/[6\gamma_\otimes(N)],$$

where

$$\gamma_\otimes(N) = \pi(N/2 + 1)/[6\Gamma(N/2 + 1)^{2/N}] \tag{39}$$

is the shaping gain of the N-sphere under uniform signaling.
The shaping gain $\gamma_s(B_N(R), \lambda) \to \pi e/6$,
independently of the dimension N. Of course, the biasing gain
depends on N, and approaches unity as $N \to \infty$. For large
values of R, $\mathrm{CER2}_s(B_N(R), \lambda) \to \lambda R^2/e$
for even-dimensional spherical constellations, and
$\mathrm{CER2}_s(B_N(R), \lambda) \to 4\lambda R^2/(\pi e)$ for
odd-dimensional spherical constellations. Similarly, for
even-dimensional spherical constellations,
$\mathrm{PAR2}(B_N(R), \lambda) \to \lambda R^2$, independently
of N. Of course, this is to be expected because the peak energy
value is approximately $R^2$, and the average energy is
approximately $1/\lambda$. Similarly, for odd-dimensional
spherical constellations,
$\mathrm{PAR2}(B_N(R), \lambda) \to 2\lambda R^2$ because the
peak energy value (in two dimensions) is approximately $2R^2$.